Hi Sameer,

You can try increasing the number of executor cores (the --executor-cores flag of spark-submit). On YARN each executor gets only one core by default, so your 10 executors may be using just 10 of the cluster's 240 cores.
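For example, something like the following (the core count of 8 is illustrative; tune it for your workload and watch for GC pressure with only 6G of executor memory):

```shell
# Same submission as before, but with --executor-cores raised so each
# executor runs several tasks in parallel instead of one.
time spark-submit --class medslogistic.MedsLogistic --master yarn-client \
  --executor-memory 6G --num-executors 10 --executor-cores 8 \
  /pathtomyapp/myapp.jar
```

With 10 executors at 8 cores each you would have 80 task slots, which should let the 104 tasks run in roughly two waves instead of serially per executor.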
-Jayant

On Fri, Nov 21, 2014 at 11:18 AM, Sameer Tilak <ssti...@live.com> wrote:
> Hi All,
> I have been using MLlib's linear regression and I have some questions
> regarding its performance. We have a cluster of 10 nodes; each node has
> 24 cores and 148 GB of memory. I am running my app as follows:
>
> time spark-submit --class medslogistic.MedsLogistic --master yarn-client \
>   --executor-memory 6G --num-executors 10 /pathtomyapp/myapp.jar
>
> I am also going to experiment with the number of executors (reducing it);
> maybe that will give us different results.
>
> The input is an 800 MB sparse file in LibSVM format. The total number of
> features is 150K. It takes approximately 70 minutes for the regression to
> finish, and the job imposes very little load on CPU, memory, network, and
> disk. The total number of tasks is 104, and the total time is divided
> fairly uniformly across these tasks. I was wondering: is it possible to
> reduce the execution time further?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org