> On 19 Jan 2016, at 16:12, Boric Tan <it.news.tre...@gmail.com> wrote:
>
> Hi there,
>
> I am new to Spark, and would like to get some help to understand if Spark can
> utilize the underlying architectures for better performance. If so, how does
> it do it?
>
> For example, assume there is a cluster built with machines of different CPUs.
> Will Spark check the individual CPU information and use some machine-specific
> setting for the tasks assigned to that machine? Or is it totally dependent on
> the underlying JVM implementation to run the JAR file, and therefore the JVM
> is the place to check if certain CPU features can be used?
>
> Thanks,
> Boric
You can't control where work is done based on CPU parts. Ideally your cluster is homogeneous, or at least varies only in CPU performance and memory.

If some of your systems have GPUs and some don't, then in a YARN cluster, label the GPU nodes and use YARN queues or spark-submit to schedule the work only on those GPU systems.

The native libraries you load into JVMs are generally where CPU checks and feature use (e.g. x86 AES opcodes for encrypt/decrypt) would go.
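For the node-label route, a minimal sketch of a submission might look like the following, assuming a YARN node label named "gpu" has already been defined and assigned to the GPU machines; the label, queue, class and jar names here are purely illustrative:

  # route both the application master and the executors onto nodes
  # carrying the (assumed) "gpu" label; "gpuqueue" is a placeholder queue
  spark-submit \
    --master yarn \
    --queue gpuqueue \
    --conf spark.yarn.am.nodeLabelExpression=gpu \
    --conf spark.yarn.executor.nodeLabelExpression=gpu \
    --class com.example.GpuJob \
    gpu-job.jar

Alternatively, you can map a queue to the label in the YARN scheduler configuration and simply submit to that queue, keeping the label expression out of the spark-submit command entirely.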