There are no bandwidth limitations in 0.20.x, none that I saw at least. It was basically bandwidth management by PWM: you could adjust how many files per node were being copied at a time, not the actual transfer rate.
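For reference, a minimal hdfs-site.xml sketch of the knobs being discussed; like Mike below, I'm going from memory, so treat the property names and defaults as assumptions that may differ between versions (dfs.balance.bandwidthPerSec throttles only the balancer, and dfs.max-repl-streams is the per-node "how many at once" control, not a real rate limit):

  <configuration>
    <!-- Bytes per second the balancer may use per datanode; this does not
         throttle normal re-replication traffic. -->
    <property>
      <name>dfs.balance.bandwidthPerSec</name>
      <value>10485760</value> <!-- 10 MB/sec -->
    </property>
    <!-- Maximum concurrent replication streams per node: the PWM-style
         files-per-node control, a duty cycle rather than a bandwidth cap. -->
    <property>
      <name>dfs.max-repl-streams</name>
      <value>2</value>
    </property>
  </configuration>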
In my case, the load was HBase real-time serving, so it was servicing many smaller random reads, not a map-reduce workload. Everyone has their own use case :-)

-ryan

On Mon, Jun 27, 2011 at 6:54 PM, Segel, Mike <[email protected]> wrote:
> That doesn't seem right.
> In one of our test clusters (19 data nodes) we found that under heavy loads
> we were disk I/O bound and not network bound. Of course YMMV depending on
> your ToR switch. If we had more than 4 disks per node, we would probably see
> the network being the bottleneck. What did you set your bandwidth settings to
> in hdfs-site.xml? (going from memory, not sure of the exact setting...)
>
> But the good news... newer hardware will start to have 10GbE on the
> motherboard.
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Jun 27, 2011, at 7:11 PM, "Ryan Rawson" <[email protected]> wrote:
>
>> On the subject of GigE vs 10GigE, I think that we will very shortly
>> be seeing interest in 10gig, since GigE is only 120MB/sec, about one
>> hard drive's worth of streaming data. Nodes with 4+ disks are throttled
>> by the network. On a small cluster (20 nodes), the replication traffic
>> can choke a cluster to death. The only way to fix it quickly is to bring
>> that node back up. Perhaps the HortonWorks guys can work on that.
>>
>> -ryan
>>
>> On Mon, Jun 27, 2011 at 4:38 AM, Steve Loughran <[email protected]> wrote:
>>> On 26/06/11 20:23, Scott Carey wrote:
>>>> On 6/23/11 5:49 AM, "Steve Loughran" <[email protected]> wrote:
>>>>> What's your HW setup? #cores/server, #servers, underlying OS?
>>>>
>>>> CentOS 5.6.
>>>> 4 cores / 8 threads per server (Nehalem generation Intel processor).
>>>
>>> That should be enough to find problems. I've just moved up to a 6-core,
>>> 12-thread desktop and that found problems in some non-Hadoop code, which
>>> shows that the more threads you have, and the faster the machines are,
>>> the more your race conditions show up. With Hadoop, the fact that you can
>>> have 10-1000 servers means that in a large cluster the probability of
>>> that race condition showing up scales well.
>>>
>>>> Also run a smaller cluster with 2x quad-core Core 2 generation Xeons.
>>>>
>>>> Off topic:
>>>> The single-proc Nehalem is faster than the dual Core 2s for most use
>>>> cases -- and much lower power. Looking forward to single-proc 4- or
>>>> 6-core Sandy Bridge based systems for the next expansion -- testing
>>>> 4-core vs 4-core has them 30% faster than the Nehalem generation systems
>>>> in CPU-bound tasks and lower power. Intel prices single-socket Xeons so
>>>> much lower than the dual-socket ones that the best value for us is to
>>>> get more single-socket servers rather than fewer dual-socket ones (with
>>>> a similar processor to hard drive ratio).
>>>
>>> Yes, in a large cluster the price of filling the second socket can compare
>>> to a lot of storage, and TB of storage is more tangible. I guess it depends
>>> on your application.
>>>
>>> Regarding Sandy Bridge, I've no experience of those, but I worry that 10
>>> Gbps is still bleeding edge, and shouldn't be needed for code with good
>>> locality anyway; it is probably more cost effective to stay at 1Gbps/server,
>>> though the issue there is that the number of HDDs per server generates lots
>>> of replication traffic when a single server fails...
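As a rough back-of-envelope on the GigE-vs-disks point quoted above (my numbers, assuming roughly 100 MB/sec of streaming throughput per SATA spindle): 1GbE tops out around 120MB/sec, while a node with 4 disks has roughly 400MB/sec of local disk bandwidth behind that single link, so the NIC is oversubscribed 3-4x whenever re-replication or other bulk traffic kicks in. That is why nodes with 4+ disks end up network-bound rather than disk-bound.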
