Are you using an SSD? We found that the bottleneck is not computation but
disk I/O. During assembly, sbt moves lots of class files and jars and
packages them into a single flat jar. On my MacBook, assembly takes about
10 minutes; before I upgraded to an SSD, it took 30 to 40 minutes.
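
A rough way to check this against the numbers in your message is to divide
CPU time by wall-clock time, which gives the average number of cores kept
busy. The sketch below assumes the reported "CPU time" is user plus system
time summed over all cores; the function names are only illustrative. Low
utilization is consistent with a build that waits on disk, although serial
build steps would look the same, so treat it as a hint rather than proof.

```python
# Sketch: estimate how many cores a build kept busy on average.
# Assumption: "CPU time" means user + system time summed over all cores.

def average_busy_cores(cpu_minutes, wall_minutes):
    """Average number of cores doing work over the whole run."""
    return cpu_minutes / wall_minutes


def utilization(cpu_minutes, wall_minutes, cores):
    """Fraction of the machine's total CPU capacity that was used."""
    return average_busy_cores(cpu_minutes, wall_minutes) / cores


CORES = 24  # from the quoted message

# (CPU minutes, wall-clock minutes) as reported in the quoted message
runs = {
    "sbt/sbt assembly": (88, 43),
    "sbt/sbt assembly (SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true)": (73, 25),
}

for name, (cpu, wall) in runs.items():
    busy = average_busy_cores(cpu, wall)
    share = utilization(cpu, wall, CORES)
    print(f"{name}: {busy:.1f} cores busy on average, {share:.0%} of {CORES}")
```

With those figures, the two runs kept only about 2 and 3 of the 24 cores
busy, so adding more CPUs is unlikely to shorten the build much.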


Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Fri, Apr 25, 2014 at 12:53 PM, Williams, Ken <[email protected]> wrote:

>  I’ve cloned the github repo and I’m building Spark on a pretty beefy
> machine (24 CPUs, 78GB of RAM) and it takes a pretty long time.
>
>
>
> For instance, today I did a ‘git pull’ for the first time in a week or
> two, and then doing ‘sbt/sbt assembly’ took 43 minutes of wallclock time
> (88 minutes of CPU time).  After that, I did ‘SPARK_HADOOP_VERSION=2.2.0
> SPARK_YARN=true sbt/sbt assembly’ and that took 25 minutes wallclock, 73
> minutes CPU.
>
>
>
> Is that typical?  Or does that indicate some setup problem in my
> environment?
>
>
>
> --
>
> Ken Williams, Senior Research Scientist
>
> *Wind**Logics*
>
> http://windlogics.com
>
