Very cool!

-----Original Message-----
From: Chanwit Kaewkasi [mailto:chan...@gmail.com] 
Sent: Wednesday, March 19, 2014 10:36 AM
To: user@spark.apache.org
Subject: Spark enables us to process Big Data on an ARM cluster !!

Hi all,

We are a small team doing research on low-power (and low-cost) ARM
clusters. We built a 20-node ARM cluster that is able to run Hadoop.
But as you all know, Hadoop performs on-disk operations, so it's not
well suited to resource-constrained machines powered by ARM.

We then switched to Spark and had to say wow!!

Spark / HDFS enables us to crunch the Wikipedia articles of 2012 (34 GB
in size) in 1h50m. We have identified the bottleneck, and it's our
100 Mbit/s network.

Here's the cluster:
https://dl.dropboxusercontent.com/u/381580/aiyara_cluster/Mk-I_SSD.png

And this is what we got from Spark's shell:
https://dl.dropboxusercontent.com/u/381580/aiyara_cluster/result_00.png
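
For anyone curious what a job of this shape looks like, here is a
minimal sketch of a Spark-shell session over an HDFS text dump. The
HDFS path and the word-count computation are illustrative assumptions
on my part, not the exact job from the screenshot above:

  // sc is the SparkContext predefined by the Spark shell
  // (the HDFS path below is a placeholder, not our real layout)
  val articles = sc.textFile("hdfs://master:9000/wikipedia/2012/articles.txt")

  // a simple word count: tokenize on whitespace,
  // pair each word with 1, then sum the counts per word
  val counts = articles
    .flatMap(line => line.split("\\s+"))
    .map(word => (word, 1))
    .reduceByKey(_ + _)

  counts.saveAsTextFile("hdfs://master:9000/wikipedia/2012/word-counts")

Because reduceByKey aggregates in memory before shuffling, a job like
this mostly stresses the network, which matches the bottleneck we saw.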

I think it's the first ARM cluster that can process a non-trivial
amount of Big Data. (Please correct me if I'm wrong.)
I really want to thank the Spark team for making this possible !!

Best regards,

-chanwit

--
Chanwit Kaewkasi
linkedin.com/in/chanwit
