Re: HBase on same boxes as HDFS Data nodes

Jamie Cockrill Wed, 07 Jul 2010 11:23:39 -0700

PS, I've now reset my MAX_FILESIZE back to the default (from the 1GB
I raised it to). It caused me to run into a delightful
'YouAreDeadException', which looks closely related to the garbage
collection issues on the Troubleshooting page, as my ZooKeeper session
expired.


Thanks

Jamie
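
On the overcommit point quoted below: one way to read it is that the
heaps of all the JVMs on a box, plus some room for the OS, should fit in
physical RAM. A rough sketch of that arithmetic follows. Only the 7992 MB
total comes from the free -m output in this thread; every heap size, the
slot count, and the OS reserve are hypothetical placeholders, and the
function name is made up for illustration. JVM heap settings also
understate real resident memory, so treat the result as optimistic.

```python
def memory_budget_mb(total_ram_mb, daemon_heaps_mb, task_slots,
                     task_heap_mb, os_reserve_mb=1024):
    """Return (committed_mb, headroom_mb) for a single node.

    committed = daemon heaps + (task slots * per-task heap) + OS reserve.
    A negative headroom means the node is oversubscribed on paper.
    """
    committed = (sum(daemon_heaps_mb.values())
                 + task_slots * task_heap_mb
                 + os_reserve_mb)
    return committed, total_ram_mb - committed


if __name__ == "__main__":
    # Hypothetical heap sizes in MB; substitute whatever the daemons
    # on the box are actually started with.
    heaps = {"datanode": 1000, "tasktracker": 1000, "regionserver": 4000}
    committed, headroom = memory_budget_mb(
        total_ram_mb=7992,      # "Mem Total" from free -m in this thread
        daemon_heaps_mb=heaps,
        task_slots=4,           # hypothetical number of concurrent tasks
        task_heap_mb=512,       # hypothetical per-task heap
    )
    print(f"committed={committed} MB, headroom={headroom} MB")
    if headroom < 0:
        print("oversubscribed on paper: swapping is likely under load")
```

With these placeholder numbers the box is over budget by roughly 1GB,
which is the kind of situation where shrinking a heap or dropping task
slots would be the first thing to try.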



On 7 July 2010 19:19, Jamie Cockrill <[email protected]> wrote:
> By overcommit, do you mean make my overcommit_ratio higher on each box
> (it's at the default 50 at the moment)? What I'm noticing at the moment
> is that hadoop is taking up the vast majority of the memory on the
> boxes.
>
> I found this article:
> http://blog.rapleaf.com/dev/2010/01/05/the-wrath-of-drwho-or-unpredictable-hadoop-memory-usage/
> which Todd, it looks like you replied to. Does this sound like a
> similar problem? No worries if you can't remember, it was back in
> January! This article suggests reducing the amount of memory allocated
> to Hadoop at startup. How would I go about doing this?
>
> Thank you everyone for your patience so far. Sorry if this is taking
> up a lot of your time.
>
> Thanks,
>
> Jamie
>
> On 7 July 2010 19:03, Jean-Daniel Cryans <[email protected]> wrote:
>> swappiness at 0 is good, but also don't overcommit your memory!
>>
>> J-D
>>
>> On Wed, Jul 7, 2010 at 10:53 AM, Jamie Cockrill
>> <[email protected]> wrote:
>>> I think you're right.
>>>
>>> Unfortunately the machines are on a separate network to this laptop,
>>> so I'm having to type everything across, apologies if it doesn't
>>> translate well...
>>>
>>> free -m gave:
>>>
>>>          Total    Used    Free
>>> Mem:      7992    7939      53
>>> b/c:              7877     114
>>> Swap:    23415     895   22519
>>>
>>> I did this on another node that isn't being smashed at the moment and
>>> the numbers came out similar, but the buffers/cache free was higher
>>>
>>> vmstat 20 is giving non-zero si and so values, ranging between 3 and
>>> just short of 5000.
>>>
>>> That seems to be it I guess. Hadoop troubleshooting suggests setting
>>> swappiness to 0, is that just a case of changing the value in
>>> /proc/sys/vm/swappiness?
>>>
>>> thanks
>>>
>>> Jamie
>>>
>>>
>>>
>>>
>>> On 7 July 2010 18:40, Todd Lipcon <[email protected]> wrote:
>>>> On Wed, Jul 7, 2010 at 10:32 AM, Jamie Cockrill
>>>> <[email protected]> wrote:
>>>>
>>>>> On the subject of GC and heap, I've left those as defaults. I could
>>>>> look at those if that's the next logical step? Would there be anything
>>>>> in any of the logs that I should look at?
>>>>>
>>>>> One thing I have noticed is that it does take an absolute age to log
>>>>> in to the DN/RS to restart the RS once it's fallen over, in one
>>>>> instance it took about 10 minutes. These are 8GB, 4 core amd64 boxes
>>>>>
>>>>>
>>>> That indicates swapping. Can you run "free -m" on the node?
>>>>
>>>> Also let "vmstat 20" run while running your job and observe the "si" and
>>>> "so" columns. If those are nonzero, it indicates you're swapping, and
>>>> you've oversubscribed your RAM (very easy on 8G machines)
>>>>
>>>> -Todd
>>>>
>>>>
>>>>
>>>>> ta
>>>>>
>>>>> Jamie
>>>>>
>>>>>
>>>>>
>>>>> On 7 July 2010 18:30, Jamie Cockrill <[email protected]> wrote:
>>>>> > Bad news, it looks like my xcievers is set as it should be, it's in
>>>>> > the hdfs-site.xml and looking at the job.xml of one of my jobs in the
>>>>> > job-tracker, it's showing that property as set to 2047. I've cat |
>>>>> > grepped one of the datanode logs and although there were a few
>>>>> > xcievers exceptions in there, they were from a few months ago. I've
>>>>> > upped my MAX_FILESIZE on
>>>>> > my table to 1GB to see if that helps (not sure if it will!).
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > Jamie
>>>>> >
>>>>> > On 7 July 2010 18:12, Jean-Daniel Cryans <[email protected]> wrote:
>>>>> >> xcievers exceptions will be in the datanodes' logs, and your problem
>>>>> >> totally looks like it. 0.20.5 will have the same issue (since it's on
>>>>> >> the HDFS side)
>>>>> >>
>>>>> >> J-D
>>>>> >>
>>>>> >> On Wed, Jul 7, 2010 at 10:08 AM, Jamie Cockrill
>>>>> >> <[email protected]> wrote:
>>>>> >>> Hi Todd & JD,
>>>>> >>>
>>>>> >>> Environment:
>>>>> >>> All (hadoop and HBase) installed as of karmic-cdh3, which means:
>>>>> >>> Hadoop 0.20.2+228
>>>>> >>> HBase 0.89.20100621+17
>>>>> >>> Zookeeper 3.3.1+7
>>>>> >>>
>>>>> >>> Unfortunately my whole cluster of regionservers have now crashed, so I
>>>>> >>> can't really say if it was swapping too much. There is a DEBUG
>>>>> >>> statement just before it crashes saying:
>>>>> >>>
>>>>> >>> org.apache.hadoop.hbase.regionserver.wal.HLog: closing hlog writer in
>>>>> >>> hdfs://<somewhere on my HDFS, in /hbase>
>>>>> >>>
>>>>> >>> What follows is:
>>>>> >>>
>>>>> >>> WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception:
>>>>> >>> org.apache.hadoop.ipc.RemoteException:
>>>>> >>> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease
>>>>> >>> on <file location as above> File does not exist. Holder
>>>>> >>> DFSClient_-11113603 does not have any open files
>>>>> >>>
>>>>> >>> It then seems to try and do some error recovery (Error Recovery for
>>>>> >>> block null bad datanode[0] nodes == null), fails (Could not get block
>>>>> >>> locations. Source file "<hbase file as before>" - Aborting). There is
>>>>> >>> then an ERROR org.apache...HRegionServer: Close and delete failed.
>>>>> >>> There is then a similar LeaseExpiredException as above.
>>>>> >>>
>>>>> >>> There are then a couple of messages from HRegionServer saying that
>>>>> >>> it's notifying master of its shutdown and stopping itself. The
>>>>> >>> shutdown hook then fires and the RemoteException and
>>>>> >>> LeaseExpiredExceptions are printed again.
>>>>> >>>
>>>>> >>> ulimit is set to 65000 (it's in the regionserver log, printed as I
>>>>> >>> restarted the regionserver), however I haven't got the xcievers set
>>>>> >>> anywhere. I'll give that a go. It does seem very odd as I did have a
>>>>> >>> few of them fall over one at a time with a few early loads, but that
>>>>> >>> seemed to be because the regions weren't splitting properly, so all
>>>>> >>> the traffic was going to one node and it was being overwhelmed. Once I
>>>>> >>> throttled it, after one load a region split seemed to get
>>>>> >>> triggered, which flung regions all over, which made subsequent loads
>>>>> >>> much more distributed. However, perhaps the time-bomb was ticking...
>>>>> >>> I'll have a go at specifying the xcievers property. I'm pretty
>>>>> >>> certain I've got everything else covered, except the patches as
>>>>> >>> referenced in the JIRA.
>>>>> >>>
>>>>> >>> I just grepped some of the log files and didn't get an explicit
>>>>> >>> exception with 'xciever' in it.
>>>>> >>>
>>>>> >>> I am considering downgrading(?) to 0.20.5, however because everything
>>>>> >>> is installed as per karmic-cdh3, I'm a bit reluctant to do so as
>>>>> >>> presumably Cloudera has tested each of these versions against each
>>>>> >>> other? And I don't really want to introduce further versioning issues.
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>>
>>>>> >>> Jamie
>>>>> >>>
>>>>> >>>
>>>>> >>> On 7 July 2010 17:30, Jean-Daniel Cryans <[email protected]> wrote:
>>>>> >>>> Jamie,
>>>>> >>>>
>>>>> >>>> Does your configuration meet the requirements?
>>>>> >>>>
>>>>> >>>> http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#requirements
>>>>> >>>>
>>>>> >>>> ulimit and xcievers, if not set, are usually time bombs that blow off
>>>>> >>>> when the cluster is under load.
>>>>> >>>>
>>>>> >>>> J-D
>>>>> >>>>
>>>>> >>>> On Wed, Jul 7, 2010 at 9:11 AM, Jamie Cockrill
>>>>> >>>> <[email protected]> wrote:
>>>>> >>>>
>>>>> >>>>> Dear all,
>>>>> >>>>>
>>>>> >>>>> My current HBase/Hadoop architecture has HBase region servers on the
>>>>> >>>>> same physical boxes as the HDFS data-nodes. I'm getting an awful lot
>>>>> >>>>> of region server crashes. The last thing that happens appears to be
>>>>> >>>>> a DroppedSnapshotException, caused by an IOException: could not
>>>>> >>>>> complete write to file <file on HDFS>. I am running it under load;
>>>>> >>>>> I'm not sure how heavy that is or how it is quantified, but I'm
>>>>> >>>>> guessing it is a load issue.
>>>>> >>>>>
>>>>> >>>>> Is it common practice to put region servers on data-nodes? Is it
>>>>> >>>>> common to see region server crashes when either the HDFS or region
>>>>> >>>>> server (or both) is under heavy load? I'm guessing that is the case
>>>>> >>>>> as I've seen a few similar posts. I've not got a great deal of
>>>>> >>>>> capacity to be separating region servers from HDFS data nodes, but
>>>>> >>>>> it might be an argument I could make.
>>>>> >>>>>
>>>>> >>>>> Thanks
>>>>> >>>>>
>>>>> >>>>> Jamie
>>>>> >>>>>
>>>>> >>>>
>>>>> >>>
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>>
>>>
>>
>
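
Todd's check earlier in the thread (watch the si and so columns of
"vmstat 20") can be sketched as a small parser. The sample text is made up
to resemble vmstat's layout and is not output from these machines; the
function names are hypothetical. Note that the first row vmstat prints is
an average since boot, so later rows say more about the current load.

```python
def swap_activity(vmstat_text):
    """Return a list of (si, so) pairs, one per sample row of vmstat output."""
    si_idx = so_idx = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        # The column header row names the fields; vmstat repeats it
        # periodically, so re-read the positions each time it appears.
        if "si" in fields and "so" in fields:
            si_idx, so_idx = fields.index("si"), fields.index("so")
            continue
        # Skip anything before the first header, blank lines, and the
        # decorative "procs ---memory---" banner row.
        if si_idx is None or not fields:
            continue
        if not all(f.isdigit() for f in fields):
            continue
        if len(fields) <= max(si_idx, so_idx):
            continue
        samples.append((int(fields[si_idx]), int(fields[so_idx])))
    return samples


def is_swapping(samples):
    """True if any sample shows pages swapped in (si) or out (so)."""
    return any(si > 0 or so > 0 for si, so in samples)


# Made-up sample in vmstat's usual layout, for illustration only.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  1 916480  54000   1200  60000    3 4800   120   900  500  800 20 10 50 20
 1  0 916480  56000   1200  61000    0    0    10    40  300  400  5  2 92  1
"""

if __name__ == "__main__":
    rows = swap_activity(SAMPLE)
    print(rows)
    print("swapping" if is_swapping(rows) else "no swap activity seen")
```

To use it against a real node, capture some "vmstat 20" output to a file
while the job runs and pass the file's contents to swap_activity().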
