Hello Alex,
Thanks for the response. There isn't much other data on the driver, so the
issue is probably inherent to this particular PCA implementation. I'll try
the alternative approach that you suggested instead. Thanks again.
-Bharath
On Wed, Jan 13, 2016 at 11:24 PM, Alex Gittens wrote:
The PCA.fit function calls the RowMatrix PCA routine, which attempts to
construct the covariance matrix locally on the driver, and then computes
the SVD of that to get the PCs. I'm not sure what's causing the memory
error: RowMatrix.scala:124 is only using about 3.5 GB of memory (n*(n+1)/2
doubles with n=29604, at 8 bytes each).
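
For reference, a minimal Scala sketch of that call path using the lower-level
mllib API (the function name and the assumption of an existing RDD[Vector]
called rows are just for illustration):

import org.apache.spark.mllib.linalg.{Matrix, Vector}
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.rdd.RDD

// rows: an existing RDD[Vector], one ~29604-dimensional vector per row.
def principalComponents(rows: RDD[Vector], k: Int = 100): Matrix = {
  val mat = new RowMatrix(rows)
  // This aggregates the Gram/covariance matrix into a single local array on
  // the driver (roughly n*(n+1)/2 doubles, ~3.5 GB for n = 29604) and then
  // takes a local SVD of the covariance to extract the top-k components.
  mat.computePrincipalComponents(k) // returned as a local n x k Matrix
}

Note that the driver has to hold the full covariance matrix regardless of how
sparse the input rows are.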
On 12-Jan-2016 2:06 pm, "Bharath Ravi Kumar" wrote:
We're running PCA (selecting 100 principal components) on a dataset that
has ~29K columns and is 70G in size stored in ~600 parts on HDFS. The
matrix in question is mostly sparse with tens of columns populated in most
rows, but a few rows with thousands of columns populated. We're running
spark on m
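
For context, here is a minimal sketch of the kind of job being described,
using the spark.ml PCA API from a 1.6-era spark-shell (the toy 6-dimensional
vectors and the column names are placeholders; the real input has ~29K
mostly-sparse columns and k = 100, which is what forces the large driver-side
covariance matrix discussed above):

import org.apache.spark.ml.feature.PCA
import org.apache.spark.mllib.linalg.Vectors

// Toy stand-in for the real input: a few rows of mostly-sparse vectors.
val data = Seq(
  (0L, Vectors.sparse(6, Array(1, 3), Array(1.0, 7.0))),
  (1L, Vectors.sparse(6, Array(0, 4), Array(2.0, 5.0))),
  (2L, Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0, 0.0))
)
val df = sqlContext.createDataFrame(data).toDF("id", "features")

val pca = new PCA()
  .setInputCol("features")
  .setOutputCol("pcaFeatures")
  .setK(3) // 100 in the actual job

// PCA.fit goes through the RowMatrix PCA routine discussed earlier in the thread.
val model = pca.fit(df)
model.transform(df).select("pcaFeatures").show()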