Glad you sorted it out! Please do tell...

On 17/02/2010, at 4:59 PM, James Baldassari <ja...@dataxu.com> wrote:
> Hi,
>
> I think we managed to solve our performance and load issues. Everything has been stable for about an hour now, but I'm not going to raise the victory flag until the morning because we've had short periods of stability in the past.
>
> I've been working on this problem non-stop for almost a week now, so I really need to get some sleep, but if everything looks good tomorrow I'll write up a summary of all the changes we made and share it with the group. Hopefully this exercise in tuning for a high-throughput real-time environment will be useful to others.
>
> Thanks,
> James
>
>
> On Tue, 2010-02-16 at 23:18 -0600, Stack wrote:
>> When you look at top on the loaded server, is it the regionserver or the datanode that is using up the CPU?
>>
>> I looked at your HDFS listing. Some of the regions have 3 and 4 files, but most look fine. A good few are on the compaction verge, so I'd imagine a lot of compaction going on. That's background work, though; it does suck CPU and I/O, but it shouldn't be too bad.
>>
>> I took a look at the regionserver log. During which time period is the server struggling? There is one log run at the start, and there it seems like nothing is untoward. Please enable DEBUG going forward; it'll shed more light on what's going on. See http://wiki.apache.org/hadoop/Hbase/FAQ#A5 for how. Otherwise, the log doesn't have anything running long enough for it to have been under serious load.
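>>
>> (For reference, the FAQ entry boils down to roughly the following line in conf/log4j.properties on each HBase node, followed by a restart; this is just a sketch, so see the FAQ above for the exact, version-specific steps.)
>>
>>   # conf/log4j.properties: enable DEBUG for all HBase classes
>>   log4j.logger.org.apache.hadoop.hbase=DEBUG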
>>
>> This is a four-node cluster now? You don't seem to have too many regions per server, yet you have a pretty high read/write rate going by your earlier requests postings. Maybe you need to add more servers. Are you going to add in those 16G machines?
>>
>> When you look at the master UI, is the request rate over time about the same for all regionservers? (Refresh the master UI every so often to take a new sampling.)
>>
>> St.Ack
>>
>>
>> On Tue, Feb 16, 2010 at 3:59 PM, James Baldassari <ja...@dataxu.com> wrote:
>>> Nope. We don't do any map reduce. We're only using Hadoop for HBase at the moment.
>>>
>>> That one node, hdfs02, still has a load of 16 with around 40% I/O and 120% CPU. The other nodes are all around 66% CPU with 0-1% I/O and load of 1 to 3.
>>>
>>> I don't think all the requests are going to hdfs02 based on the status 'detailed' output. It seems like that node is just having a much harder time getting the data or something. Maybe we have some incorrect HDFS setting. All the configs are identical, though.
>>>
>>> -James
>>>
>>>
>>> On Tue, 2010-02-16 at 17:45 -0600, Dan Washusen wrote:
>>>> You mentioned in a previous email that you have a Task Tracker process running on each of the nodes. Is there any chance there is a map reduce job running?
>>>>
>>>> On 17 February 2010 10:31, James Baldassari <ja...@dataxu.com> wrote:
>>>>
>>>>> On Tue, 2010-02-16 at 16:45 -0600, Stack wrote:
>>>>>> On Tue, Feb 16, 2010 at 2:25 PM, James Baldassari <ja...@dataxu.com> wrote:
>>>>>>> On Tue, 2010-02-16 at 14:05 -0600, Stack wrote:
>>>>>>>> On Tue, Feb 16, 2010 at 10:50 AM, James Baldassari <ja...@dataxu.com> wrote:
>>>>>>>
>>>>>>> Whether the keys themselves are evenly distributed is another matter. Our keys are user IDs, and they should be fairly random. If we do a status 'detailed' in the hbase shell, we see the following distribution for the value of "requests" (not entirely sure what this value means):
>>>>>>>
>>>>>>> hdfs01: 7078
>>>>>>> hdfs02: 5898
>>>>>>> hdfs03: 5870
>>>>>>> hdfs04: 3807
>>>>>>>
>>>>>> That looks like they are evenly distributed. Requests are how many hits a second. See the UI on master port 60010. The numbers should match.
>>>>>
>>>>> So the total across all 4 region servers would be 22,653/second? Hmm, that doesn't seem too bad. I guess we just need a little more throughput...
>>>>>
>>>>>>> There are no order-of-magnitude differences here, and the request count doesn't seem to map to the load on the server. Right now hdfs02 has a load of 16 while the 3 others have loads between 1 and 2.
>>>>>>
>>>>>> This is interesting. I went back over your dumps of cache stats above, and the 'loaded' servers didn't have any attribute there that differentiated them from the others. For example, the number of storefiles seemed about the same.
>>>>>>
>>>>>> I wonder what is making for the high load? Can you figure it out? Is it high CPU use (unlikely)? Is it then high I/O? Can you try to figure out what's different about the layout under the loaded server versus that of an unloaded server? Maybe do a ./bin/hadoop fs -lsr /hbase and see if anything jumps out at you.
>>>>>
>>>>> It's I/O wait that is killing the highly loaded server. The CPU usage reported by top is just about the same across all servers (around 100% on an 8-core node), but one server at any given time has a much higher load due to I/O.
>>>>>
>>>>>> If you want to post the above or a loaded server's log to pastebin, we'll take a look-see.
>>>>>
>>>>> I'm not really sure what to look for, but maybe someone else will notice something, so here's the output of hadoop fs -lsr /hbase: http://pastebin.com/m98096de
>>>>>
>>>>> And here is today's region server log from hdfs02, which seems to get hit particularly hard: http://pastebin.com/m1d8a1e5f
>>>>>
>>>>> Please note that we restarted it several times today, so some of those errors are probably just due to restarting the region server.
>>>>>
>>>>>>> Applying HBASE-2180 did not make any measurable difference. There are no errors in the region server logs. However, looking at the Hadoop datanode logs, I'm seeing lots of these:
>>>>>>>
>>>>>>> 2010-02-16 17:07:54,064 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.24.183.165:50010, storageID=DS-1519453437-10.24.183.165-50010-1265907617548, infoPort=50075, ipcPort=50020):DataXceiver
>>>>>>> java.io.EOFException
>>>>>>>         at java.io.DataInputStream.readShort(DataInputStream.java:298)
>>>>>>>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:79)
>>>>>>>         at java.lang.Thread.run(Thread.java:619)
>>>>>>
>>>>>> You upped xceivers on your HDFS cluster? If you look at the other end of the above EOFE, can you see why it died?
>>>>>
>>>>> Max xceivers = 3072; datanode handler count = 20; region server handler count = 100.
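>>>>>
>>>>> Concretely, those three settings correspond to roughly the following. The property names here assume a Hadoop/HBase 0.20-era cluster, so double-check them against your version; all three need a datanode/regionserver restart to take effect:
>>>>>
>>>>>   <!-- hdfs-site.xml on every datanode -->
>>>>>   <property>
>>>>>     <name>dfs.datanode.max.xcievers</name>  <!-- sic: historically misspelled property name -->
>>>>>     <value>3072</value>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>dfs.datanode.handler.count</name>
>>>>>     <value>20</value>
>>>>>   </property>
>>>>>
>>>>>   <!-- hbase-site.xml on every regionserver -->
>>>>>   <property>
>>>>>     <name>hbase.regionserver.handler.count</name>
>>>>>     <value>100</value>
>>>>>   </property>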
>>>>>
>>>>> I can't find the other end of the EOFException. I looked in the Hadoop and HBase logs on the server that is the name node and HBase master, as well as on the HBase client.
>>>>>
>>>>> Thanks for all the help!
>>>>>
>>>>> -James
>>>>>
>>>>>>> However, I do think it's strange that the load is so unbalanced on the region servers.
>>>>>>
>>>>>> I agree.
>>>>>>
>>>>>>> We're also going to try throwing some more hardware at the problem. We'll set up a new cluster with 16-core, 16G nodes to see if they are better able to handle the large number of client requests. We might also decrease the block size to 32k or lower.
>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>>>> Should only be a matter if you intend distributing the above.
>>>>>>>
>>>>>>> This is probably a topic for a separate thread, but I've never seen a legal definition for the word "distribution." How does this apply to the SaaS model?
>>>>>>
>>>>>> Fair enough.
>>>>>>
>>>>>> Something is up. Especially if HBASE-2180 made no difference.
>>>>>>
>>>>>> St.Ack
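
(A note on the "block size" mentioned above: in HBase this normally means the per-column-family HFile block size, which defaults to 64 KB. Lowering it to 32 KB on an existing table would look roughly like this in the hbase shell, where 'mytable' and 'mycf' are placeholder names; on 0.20-era releases the table has to be disabled first:

  disable 'mytable'
  alter 'mytable', {NAME => 'mycf', BLOCKSIZE => '32768'}
  enable 'mytable'

Only newly written HFiles pick up the smaller block size; existing files keep the old one until compactions rewrite them.)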