Hi, John. I've had similar problems. IIRC, the driver was GCing madly. I don't know why the driver was doing so much work, but I quickly implemented an alternative approach. The code I wrote belongs to my client, but I wrote something that should be equivalent; it can be found at https://github.com/PhillHenry/Algernon. It's not terribly complicated and you could roll your own if you prefer (the rough idea is described at http://javaagile.blogspot.co.at/2016/11/an-alternative-approach-to-matrices-in.html). In any case, I got good performance this way.
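The rough idea, in case the links die: store the matrix as an RDD of (row, col, value) entries and do the multiply yourself as a join on the shared inner index followed by a reduceByKey. This is just an untested sketch using my own naming and a plain coordinate layout, not the actual Algernon code (which has its own classes):

import org.apache.spark.rdd.RDD

// Sketch only: multiply A (n x k) by B (k x m), each stored as an RDD of
// (row, col, value) coordinate entries.
def multiply(a: RDD[(Int, Int, Double)],
             b: RDD[(Int, Int, Double)]): RDD[((Int, Int), Double)] = {
  val aByCol = a.map { case (i, j, v) => (j, (i, v)) } // key A's entries by column index
  val bByRow = b.map { case (j, k, v) => (j, (k, v)) } // key B's entries by row index
  aByCol.join(bByRow)                                  // match entries on the inner index j
    .map { case (_, ((i, av), (k, bv))) => ((i, k), av * bv) }
    .reduceByKey(_ + _)                                // C(i, k) = sum over j of A(i,j) * B(j,k)
}

For multiplying A by its own transpose you don't even need a second RDD: b is just a.map { case (i, j, v) => (j, i, v) }. One caveat: a coordinate layout only really pays off if the matrix is sparse-ish; for a genuinely dense 100k by 500k matrix (50 billion entries) the shuffle in that join will hurt whatever representation you pick.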
Phill

On Thu, May 11, 2017 at 10:12 PM, John Compitello <jo...@broadinstitute.org>
wrote:

> Hey all,
>
> I’ve found myself in a position where I need to do a relatively large
> matrix multiply (at least, compared to what I normally have to do). I’m
> looking to multiply a 100k by 500k dense matrix by its transpose to yield a
> 100k by 100k matrix. I’m trying to do this on Google Cloud, so I don’t have
> any real limits on cluster size or memory. However, I have no idea where to
> begin as far as number of cores / number of partitions / how big to make
> the block size for best performance. Is there anywhere that Spark users
> collect optimal configurations for methods relative to data input size?
> Does anyone have any suggestions? I’ve tried throwing 900 cores at a 100k
> by 100k matrix multiply with 1000 by 1000 sized blocks, and that seemed to
> hang forever and eventually fail.
>
> Thanks,
>
> John
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>