Check this out:
http://www.symantec.com/connect/articles/getting-hang-iops-v13#a12

Maybe this helps. I suspect their RAID configuration, or the striping across
all twelve spindles, is what accounts for the difference. Just my guess!
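
For what it's worth, here's a rough back-of-the-envelope sketch (Python) of
the arithmetic involved. The 50-IOPS-per-disk and 2,000-IOPS figures are the
whitepaper's claims, not measurements of mine:

    # Rough IOPS arithmetic for the numbers in the NetApp quote below.
    # Assumption (from the whitepaper): ~50 random IOPS per SATA disk.

    DISKS = 12
    IOPS_PER_DISK = 50

    # JBOD: each MapReduce task reads from a single disk, so the
    # aggregate can never exceed the simple sum of the spindles.
    jbod_iops = DISKS * IOPS_PER_DISK  # 12 x 50 = 600

    # SAN stripe: the volume is striped across all 12 spindles and sits
    # behind the array controller (cache, command queueing), which is
    # presumably how NetApp arrives at 2,000 IOPS for the E2660.
    e2660_iops = 2000  # NetApp's figure, not mine

    print("JBOD ceiling: %d IOPS" % jbod_iops)
    print("E2660 stripe: %d IOPS" % e2660_iops)
    print("Ratio:        %.1fx" % (e2660_iops / float(jbod_iops)))  # ~3.3x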

Thanks,
Abhishek

From: Jitendra Kumar Singh [mailto:[email protected]]
Sent: Thursday, October 18, 2012 6:49 AM
To: [email protected]
Subject: Re: HDFS using SAN

Hi,

The NetApp whitepaper on the SAN solution (link given by Kevin) makes the
following statement. Can someone please elaborate (or point to a link that
explains) how 12 disks in a SAN can deliver 2,000 IOPS when the same 12
disks used as JBOD would deliver only 600 IOPS?

"The E2660 can deliver up to 2,000 IOPS
from a 12-disk stripe (the bottleneck being the 12 disks). This headroom 
translates into better read times
for those 64KB blocks. Twelve copies of 12 MapReduce jobs reading from 12 SATA 
disks can at best
never exceed 12 x 50 IOPS, or 600 IOPS. The E2660 volume has five times the 
IOPS headroom, which
translates into faster read times and high MapReduce throughput "

Thanks and Regards,
--
Jitendra Kumar Singh


On Thu, Oct 18, 2012 at 6:02 PM, Luca Pireddu <[email protected]> wrote:
On 10/18/2012 02:21 AM, Pamecha, Abhishek wrote:
Tom,

Do you mean you are using GPFS instead of HDFS? Also, if you can share,
are you deploying it as a DAS setup or on a SAN?

Thanks,

Abhishek

Though I don't think I'd buy a SAN for a new Hadoop cluster, we have one and 
are using it *instead of HDFS* with a small/medium Hadoop MapReduce cluster 
(up to 100 nodes or so, depending on our need).  We still use the local node 
disks for intermediate data (mapred local storage).  Although this set-up 
does limit our ability to scale to a large number of nodes, that's not a 
concern for us.  On the plus side, we gain the flexibility to share our 
cluster with non-Hadoop users at our centre.
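
In case it helps to see it concretely, the gist of such a setup in Hadoop
1.x terms is roughly the following. The property names are standard; the
file:/// default and the paths are illustrative assumptions about pointing
Hadoop at a shared POSIX mount, not our exact configuration:

    <!-- core-site.xml: use the shared POSIX filesystem (e.g. a GPFS
         mount) as the default filesystem instead of an hdfs:// URI.
         Job input/output paths then resolve on the shared mount. -->
    <property>
      <name>fs.default.name</name>
      <value>file:///</value>
    </property>

    <!-- mapred-site.xml: keep intermediate map output on each node's
         local disks; these paths are made up. -->
    <property>
      <name>mapred.local.dir</name>
      <value>/mnt/local1/mapred,/mnt/local2/mapred</value>
    </property>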


--
Luca Pireddu
CRS4 - Distributed Computing Group
Loc. Pixina Manna Edificio 1
09010 Pula (CA), Italy
Tel: +39 0709250452
