Re: Thoughts about Hadoop cluster hardware
Awesome! I appreciate it. I'm off on training right now so I'm just starting to catch up. I'll check out those servers and see how they compare. Thanks a bunch!

On Tue, Jul 13, 2010 at 8:36 PM, Allen Wittenauer wrote:

> On Jul 13, 2010, at 5:00 PM, u235sentinel wrote:
>
> > So we're talking to Dell about their new PowerEdge C2100 servers for a Hadoop cluster but I'm wondering. Isn't this still a little overboard for nodes in a cluster? I'm wondering if we bought say 100 PowerEdge 2750s instead of just 50 C2100s. The price would be about the same for the configuration we're talking about and we would get twice as many nodes.
>
> Ultimately, it depends upon your job flow and how much data you have.
>
> FWIW we're currently using a Sun equivalent of the C2100s w/8 of the 12 drive slots filled. You need a *LOT* of iops to make it worthwhile. [From what I've seen, even people who think they have a lot of iops generally have other problems with their code/tuning that are causing the iops. So even if you think you have a lot, you may not.]
>
> > I'm curious if anyone else is running Dell PowerEdge servers with Hadoop.
> >
> > We've also been kicking the idea around of going with blade servers (Dell and/or HP).
>
> If you are thinking traditional blade where storage comes mainly from NAS or SAN, you are going to be very, very unhappy unless your data set is very, very tiny.
>
> Check out the PoweredBy page on the wiki. Quite a few folks list their gear. FWIW, we're currently evaluating HP SLs and should be getting some Dell C6100s in soon, assuming Dell can deliver the eval unit on time.
Thoughts about Hadoop cluster hardware
So we're talking to Dell about their new PowerEdge C2100 servers for a Hadoop cluster but I'm wondering. Isn't this still a little overboard for nodes in a cluster? I'm wondering if we bought say 100 PowerEdge 2750s instead of just 50 C2100s. The price would be about the same for the configuration we're talking about and we would get twice as many nodes.

I'm curious if anyone else is running Dell PowerEdge servers with Hadoop.

We've also been kicking the idea around of going with blade servers (Dell and/or HP).

Just curious. Thanks!!
Sensage to Hadoop conversion?
Is there a way to convert Sensage systems over to the Hadoop store? While we're miles away from switching, there is growing interest and I'm going at this on my own for now :=)
Re: Does Hadoop compress files?
OK, that's what I was thinking. I was wondering if Hadoop did on-the-fly compression as it stored files in HDFS, like Sensage does. But it sounds like Hadoop will take a compressed file and store it as compressed, which is fine by me. Sensage will do the same.

I believe this answers the question. Sonal's link suggests there is support for compression using zlib, gzip and bzip2.

One more question though: are there any issues with searching data that is stored in compressed format? I'm curious if there is a disadvantage in doing this. I could build bigger and badder servers but was hoping for compression.

Thanks

Eric Sammer wrote:

> To clarify, there is no implicit compression in HDFS. In other words, if you want your data to be compressed, you have to write it that way. If you plan on writing map reduce jobs to process the compressed data, you'll want to use a splittable compression format. This generally means LZO or block compressed SequenceFiles, which others have mentioned.
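To make the "splittable" point concrete, here is a minimal Python sketch of the idea behind block compression. It is not the SequenceFile format or any Hadoop API; the function names, the sample log lines, and the block size of 1000 records are all made up for illustration. The only point it shows is that when records are compressed in independent blocks, a reader can decode any one block without touching the others, which is what lets separate map tasks each start on their own piece of a file. A single gzip stream over the whole file generally has to be read from the beginning instead.

```python
import zlib

def compress_blocks(records, records_per_block=1000):
    """Compress records in independent blocks (hypothetical helper, not a Hadoop API)."""
    blocks = []
    for start in range(0, len(records), records_per_block):
        chunk = "\n".join(records[start:start + records_per_block]).encode("utf-8")
        # Each block is its own zlib stream, so it can be decoded on its own.
        blocks.append(zlib.compress(chunk))
    return blocks

def read_block(blocks, index):
    """Decode a single block without reading any of the blocks before it."""
    return zlib.decompress(blocks[index]).decode("utf-8").split("\n")

# Made-up log lines standing in for real log data.
records = [
    f"2010-07-13 20:36:{i % 60:02d} host{i % 7} GET /page/{i}"
    for i in range(5000)
]

blocks = compress_blocks(records)
raw_size = sum(len(r) + 1 for r in records)      # +1 for the newline per record
packed_size = sum(len(b) for b in blocks)
print(f"{len(blocks)} blocks, {raw_size} bytes raw, {packed_size} bytes compressed")

# A worker assigned block 2 can start there directly.
third = read_block(blocks, 2)
print(third[0])
```

The trade-off is that smaller blocks give more places to split the work but usually compress a little worse, since each block starts with no history.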
Does Hadoop compress files?
I'm starting to evaluate Hadoop. We are currently running Sensage and store a lot of log files in our current environment. I've been looking at the Hadoop forums and googling (of course) but haven't learned if HDFS does any compression of the files we store.

On average we're storing about 600 GB a week in log files (more or less). Generally we need to keep about 1.5 to 2 years of logs. With Sensage compression we can store about 200+ TB of logs in our current environment.

As I said, we're starting to evaluate if Hadoop would be a good replacement for our Sensage environment (or at least augment it).

Thanks a bunch!!
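For rough sizing, the figures above can be turned into a back-of-the-envelope estimate. This is only a sketch under stated assumptions, not a capacity plan: it assumes 600 GB/week is the uncompressed volume, uses decimal units (1 TB = 1000 GB), takes a replication factor of 3 (the HDFS default), and uses a purely hypothetical compression ratio, since no measured ratio appears in this thread.

```python
GB_PER_WEEK = 600          # from the post: roughly 600 GB of logs per week
WEEKS_PER_YEAR = 52
REPLICATION = 3            # assumption: HDFS default replication factor
EXAMPLE_RATIO = 5.0        # hypothetical compression ratio, not a measured value

def raw_tb(years):
    """Uncompressed log volume in TB (1 TB = 1000 GB) for the retention period."""
    return GB_PER_WEEK * WEEKS_PER_YEAR * years / 1000

def hdfs_tb(years, replication=REPLICATION, compression_ratio=1.0):
    """Disk space consumed in HDFS after replication and optional compression."""
    return raw_tb(years) * replication / compression_ratio

for years in (1.5, 2.0):
    print(
        f"{years} years: {raw_tb(years):.1f} TB raw, "
        f"{hdfs_tb(years):.1f} TB on disk uncompressed, "
        f"{hdfs_tb(years, compression_ratio=EXAMPLE_RATIO):.1f} TB at {EXAMPLE_RATIO}:1"
    )
```

The replication factor multiplies everything, so the compression ratio you actually get on your own logs is the number worth measuring before sizing hardware.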