Swappiness at 0 is good, but also don't overcommit your memory!

J-D

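As a rough illustration of the two points above (checking swappiness and not overcommitting memory), here is a minimal Python sketch. The heap sizes and the 1 GB OS reserve are made-up placeholders, not values from this cluster (the thread says GC and heap were left at defaults); only the 7992 MB total comes from the free -m output quoted below. The function names are hypothetical too.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an int (Linux only)."""
    return int(Path(path).read_text().strip())


def memory_headroom_mb(physical_mb, heap_sizes_mb, os_reserve_mb=1024):
    """Physical RAM minus an OS reserve and the sum of all JVM heaps.

    A negative result means the box is oversubscribed and is likely
    to swap under load. os_reserve_mb is an assumed figure.
    """
    return physical_mb - os_reserve_mb - sum(heap_sizes_mb.values())


if __name__ == "__main__":
    # Hypothetical per-daemon heap settings in MB; substitute your own.
    heaps = {
        "datanode": 1000,
        "tasktracker": 1000,
        "regionserver": 4000,
        "mapreduce_children": 4 * 512,  # assumed 4 task slots of 512 MB
    }
    print("headroom (MB):", memory_headroom_mb(7992, heaps))
    try:
        print("swappiness:", read_swappiness())
    except (OSError, ValueError):
        print("swappiness: not readable on this system")
```

On the question in the quoted mail: yes, the value can be changed at runtime by writing to /proc/sys/vm/swappiness as root (or with sysctl vm.swappiness=0), but it only survives a reboot if it is also added to /etc/sysctl.conf.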
On Wed, Jul 7, 2010 at 10:53 AM, Jamie Cockrill <[email protected]> wrote:
> I think you're right.
>
> Unfortunately the machines are on a separate network to this laptop,
> so I'm having to type everything across, apologies if it doesn't
> translate well...
>
> free -m gave:
>
>         Total   Used   Free
> Mem:     7992   7939     53
> b/c:            7877    114
> Swap:   23415    895  22519
>
> I did this on another node that isn't being smashed at the moment and
> the numbers came out similar, but the buffers/cache free was higher
>
> vmstat 20 is giving non-zero si and so's ranging between 3 and just
> short of 5000.
>
> That seems to be it I guess. Hadoop troubleshooting suggests setting
> swappiness to 0, is that just a case of changing the value in
> /proc/sys/vm/swappiness?
>
> thanks
>
> Jamie
>
> On 7 July 2010 18:40, Todd Lipcon <[email protected]> wrote:
>> On Wed, Jul 7, 2010 at 10:32 AM, Jamie Cockrill
>> <[email protected]> wrote:
>>
>>> On the subject of GC and heap, I've left those as defaults. I could
>>> look at those if that's the next logical step? Would there be anything
>>> in any of the logs that I should look at?
>>>
>>> One thing I have noticed is that it does take an absolute age to log
>>> in to the DN/RS to restart the RS once it's fallen over, in one
>>> instance it took about 10 minutes. These are 8GB, 4 core amd64 boxes
>>>
>> That indicates swapping. Can you run "free -m" on the node?
>>
>> Also let "vmstat 20" run while running your job and observe the "si" and
>> "so" columns. If those are nonzero, it indicates you're swapping, and you've
>> oversubscribed your RAM (very easy on 8G machines)
>>
>> -Todd
>>
>>> ta
>>>
>>> Jamie
>>>
>>> On 7 July 2010 18:30, Jamie Cockrill <[email protected]> wrote:
>>> > Bad news, it looks like my xcievers is set as it should be, it's in
>>> > the hdfs-site.xml and looking at the job.xml of one of my jobs in the
>>> > job-tracker, it's showing that property as set to 2047. I've cat |
>>> > grepped one of the datanode logs and although there were a few in
>>> > there, they were from a few months ago. I've upped my MAX_FILESIZE on
>>> > my table to 1GB to see if that helps (not sure if it will!).
>>> >
>>> > Thanks,
>>> >
>>> > Jamie
>>> >
>>> > On 7 July 2010 18:12, Jean-Daniel Cryans <[email protected]> wrote:
>>> >> xcievers exceptions will be in the datanodes' logs, and your problem
>>> >> totally looks like it. 0.20.5 will have the same issue (since it's on
>>> >> the HDFS side)
>>> >>
>>> >> J-D
>>> >>
>>> >> On Wed, Jul 7, 2010 at 10:08 AM, Jamie Cockrill
>>> >> <[email protected]> wrote:
>>> >>> Hi Todd & JD,
>>> >>>
>>> >>> Environment:
>>> >>> All (hadoop and HBase) installed as of karmic-cdh3, which means:
>>> >>> Hadoop 0.20.2+228
>>> >>> HBase 0.89.20100621+17
>>> >>> Zookeeper 3.3.1+7
>>> >>>
>>> >>> Unfortunately my whole cluster of regionservers have now crashed, so I
>>> >>> can't really say if it was swapping too much. There is a DEBUG
>>> >>> statement just before it crashes saying:
>>> >>>
>>> >>> org.apache.hadoop.hbase.regionserver.wal.HLog: closing hlog writer in
>>> >>> hdfs://<somewhere on my HDFS, in /hbase>
>>> >>>
>>> >>> What follows is:
>>> >>>
>>> >>> WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception:
>>> >>> org.apache.hadoop.ipc.RemoteException:
>>> >>> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
>>> >>> on <file location as above> File does not exist. Holder
>>> >>> DFSClient_-11113603 does not have any open files
>>> >>>
>>> >>> It then seems to try and do some error recovery (Error Recovery for
>>> >>> block null bad datanode[0] nodes == null), fails (Could not get block
>>> >>> locations. Source file "<hbase file as before>" - Aborting). There is
>>> >>> then an ERROR org.apache...HRegionServer: Close and delete failed.
>>> >>> There is then a similar LeaseExpiredException as above.
>>> >>>
>>> >>> There are then a couple of messages from HRegionServer saying that
>>> >>> it's notifying master of its shutdown and stopping itself. The
>>> >>> shutdown hook then fires and the RemoteException and
>>> >>> LeaseExpiredExceptions are printed again.
>>> >>>
>>> >>> ulimit is set to 65000 (it's in the regionserver log, printed as I
>>> >>> restarted the regionserver), however I haven't got the xcievers set
>>> >>> anywhere. I'll give that a go. It does seem very odd as I did have a
>>> >>> few of them fall over one at a time with a few early loads, but that
>>> >>> seemed to be because the regions weren't splitting properly, so all
>>> >>> the traffic was going to one node and it was being overwhelmed. Once I
>>> >>> throttled it, after one load a region split seemed to get
>>> >>> triggered, which flung regions all over, which made subsequent loads
>>> >>> much more distributed. However, perhaps the time-bomb was ticking...
>>> >>> I'll have a go at specifying the xcievers property. I'm pretty
>>> >>> certain I've got everything else covered, except the patches as
>>> >>> referenced in the JIRA.
>>> >>>
>>> >>> I just grepped some of the log files and didn't get an explicit
>>> >>> exception with 'xciever' in it.
>>> >>>
>>> >>> I am considering downgrading(?) to 0.20.5, however because everything
>>> >>> is installed as per karmic-cdh3, I'm a bit reluctant to do so as
>>> >>> presumably Cloudera has tested each of these versions against each
>>> >>> other? And I don't really want to introduce further versioning issues.
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Jamie
>>> >>>
>>> >>> On 7 July 2010 17:30, Jean-Daniel Cryans <[email protected]> wrote:
>>> >>>> Jamie,
>>> >>>>
>>> >>>> Does your configuration meet the requirements?
>>> >>>> http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#requirements
>>> >>>>
>>> >>>> ulimit and xcievers, if not set, are usually time bombs that blow off when
>>> >>>> the cluster is under load.
>>> >>>>
>>> >>>> J-D
>>> >>>>
>>> >>>> On Wed, Jul 7, 2010 at 9:11 AM, Jamie Cockrill <[email protected]> wrote:
>>> >>>>
>>> >>>>> Dear all,
>>> >>>>>
>>> >>>>> My current HBase/Hadoop architecture has HBase region servers on the
>>> >>>>> same physical boxes as the HDFS data-nodes. I'm getting an awful lot
>>> >>>>> of region server crashes. The last thing that happens appears to be a
>>> >>>>> DroppedSnapshot Exception, caused by an IOException: could not
>>> >>>>> complete write to file <file on HDFS>. I am running it under load, how
>>> >>>>> heavy that is I'm not sure how that is quantified, but I'm guessing it
>>> >>>>> is a load issue.
>>> >>>>>
>>> >>>>> Is it common practice to put region servers on data-nodes? Is it
>>> >>>>> common to see region server crashes when either the HDFS or region
>>> >>>>> server (or both) is under heavy load? I'm guessing that is the case as
>>> >>>>> I've seen a few similar posts. I've not got a great deal of capacity
>>> >>>>> to be separating region servers from HDFS data nodes, but it might be
>>> >>>>> an argument I could make.
>>> >>>>>
>>> >>>>> Thanks
>>> >>>>>
>>> >>>>> Jamie
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
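
The "si"/"so" check Todd describes in the quoted thread can be sketched in Python as below. This is only an illustration: the function names are hypothetical, it assumes the usual vmstat layout where a header row names the columns, and any sample text fed to it should be captured from your own node.

```python
def swap_activity(vmstat_output):
    """Return a list of (si, so) pairs parsed from `vmstat <interval>` text.

    Column positions are taken from the header row that contains both
    "si" and "so", so periodically repeated headers are handled too.
    """
    si_idx = so_idx = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            si_idx, so_idx = fields.index("si"), fields.index("so")
            continue
        if si_idx is None or len(fields) <= max(si_idx, so_idx):
            continue  # banner line, blank line, or no header seen yet
        try:
            samples.append((int(fields[si_idx]), int(fields[so_idx])))
        except ValueError:
            continue  # not a data row
    return samples


def is_swapping(samples):
    """True if any sample shows pages swapped in or out."""
    return any(si > 0 or so > 0 for si, so in samples)
```

One way to use it would be to save the output of vmstat 20 to a file while the job runs, then pass the file contents to swap_activity() and the result to is_swapping().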
