Just as a quick reminder regarding what Todd mentioned, that's exactly what
was happening in this case study...

http://hbase.apache.org/book.html#casestudies.slownode

... although it doesn't appear to be the problem in this particular
situation.

On 3/29/12 8:22 PM, "Juhani Connolly" <[email protected]> wrote:

>On Fri, Mar 30, 2012 at 7:36 AM, Todd Lipcon <[email protected]> wrote:
>> On the other hand, I've seen that "frame errors" are often correlated
>> with NICs auto-negotiating to the wrong speed, etc. Double check with
>> ethtool that all of your machines are gigabit full-duplex and not
>> doing something strange. Also double check your bonding settings, etc.
>>
>> -Todd
>>
>
>I did this after seeing the errors on ifconfig, but everything looks
>ok on that front:
>Settings for eth0:
>        Supported ports: [ TP ]
>        Supported link modes:   10baseT/Half 10baseT/Full
>                                100baseT/Half 100baseT/Full
>                                1000baseT/Full
>        Supports auto-negotiation: Yes
>        Advertised link modes:  10baseT/Half 10baseT/Full
>                                100baseT/Half 100baseT/Full
>                                1000baseT/Full
>        Advertised auto-negotiation: Yes
>        Speed: 1000Mb/s
>        Duplex: Full
>        Port: Twisted Pair
>        PHYAD: 1
>        Transceiver: internal
>        Auto-negotiation: on
>        Supports Wake-on: g
>        Wake-on: d
>        Link detected: yes
>
>Also, since yesterday the error counts have not increased at all so I
>guess that was just a red herring...
>
>
>> 2012/3/28 Dave Wang <[email protected]>:
>>> As you said, the amount of errors and drops you are seeing are very
>>> small compared to your overall traffic, so I doubt that is a
>>> significant contributor to the throughput problems you are seeing.
>>>
>>> - Dave
>>>
>>> On Wed, Mar 28, 2012 at 7:36 PM, Juhani Connolly <
>>> [email protected]> wrote:
>>>
>>>> Ron,
>>>>
>>>> thanks for sharing those settings. Unfortunately they didn't help
>>>> with our read throughput, but every little bit helps.
>>>>
>>>> Another suspicious thing that has come up is with the network...
>>>> While overall throughput has been verified to be able to go much
>>>> higher than the tax hbase is putting on it right now, there seem to
>>>> be errors and dropped packets (though this is relative to a massive
>>>> amount of traffic):
>>>>
>>>> [juhani_connolly@hornet-**slave01 ~]$ sudo /sbin/ifconfig bond0
>>>> Password:
>>>> bond0  Link encap:Ethernet  HWaddr 78:2B:CB:59:A9:34
>>>>        inet addr:********  Bcast:**********  Mask:255.255.0.0
>>>>        inet6 addr: fe80::7a2b:cbff:fe59:a934/64 Scope:Link
>>>>        UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>>>>        RX packets:9422705447 errors:605 dropped:6222 overruns:0 frame:605
>>>>        TX packets:9317689449 errors:0 dropped:0 overruns:0 carrier:0
>>>>        collisions:0 txqueuelen:0
>>>>        RX bytes:6609813756075 (6.0 TiB)  TX bytes:6033761947482 (5.4 TiB)
>>>>
>>>> could this possibly be a problem cause?
>>>> Since we haven't heard anything on expected throughput we're
>>>> downgrading our hdfs back to 0.20.2, I'd be curious to hear how
>>>> other people do with 0.23 and the throughput they're getting.
>>>>
>>>>
>>>> On 03/29/2012 02:56 AM, Buckley,Ron wrote:
>>>>
>>>>> Stack,
>>>>>
>>>>> We're about 80% random read and 20% random write. So, that would have
>>>>> been the mix that we were running.
>>>>>
>>>>> We'll try a test with Nagle on and then Nagle off, random write only,
>>>>> later this afternoon and see if the same pattern emerges.
>>>>>
>>>>> Ron
>>>>>
>>>>> -----Original Message-----
>>>>> From: [email protected] [mailto:[email protected]] On Behalf Of
>>>>> Stack
>>>>> Sent: Wednesday, March 28, 2012 1:12 PM
>>>>> To: [email protected]
>>>>> Subject: Re: 0.92 and Read/writes not scaling
>>>>>
>>>>> On Wed, Mar 28, 2012 at 5:41 AM, Buckley,Ron<[email protected]>
>>>>> wrote:
>>>>>
>>>>>> For us, setting these two, got rid of all of the 20 and 40 ms
>>>>>> response times and dropped the average response time we measured
>>>>>> from HBase by more than half. Plus, we can push HBase a lot harder.
>>>>>>
>>>>>> That had an effect on random read workload only Ron?
>>>>> Thanks,
>>>>> St.Ack
>>>>>
>>>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>
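
For a sense of scale on Dave's point about the errors being small relative to overall traffic, here is a minimal sketch that turns the RX counters from the bond0 ifconfig output quoted in the thread into rates per million packets. The helper name `rate_per_million` is made up for illustration; only the three counter values come from the thread.

```python
# Counters copied from the bond0 ifconfig output quoted in the thread.
rx_packets = 9_422_705_447
rx_errors = 605      # equals the "frame" count in the same output
rx_dropped = 6_222


def rate_per_million(events, packets):
    """Return the number of events per million packets (hypothetical helper)."""
    return events / packets * 1_000_000


error_ppm = rate_per_million(rx_errors, rx_packets)
drop_ppm = rate_per_million(rx_dropped, rx_packets)

print(f"RX errors:  {error_ppm:.3f} per million packets")
print(f"RX dropped: {drop_ppm:.3f} per million packets")
```

Both rates come out well under one per million packets, which fits the reading that these counters are unlikely to explain the throughput problem by themselves. Whether the counts are still increasing matters more than the totals, since the counters shown are cumulative.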
