On Mon, May 16, 2011 at 4:55 AM, Stan Barton <bartx...@gmail.com> wrote:
>> Sorry.  How do you enable overcommitment of memory, or do you mean to
>> say that your processes add up to more than the RAM you have?
>>
>
> Memory overcommitment is needed to let Java still "allocate" memory for
> executing external bash commands like "du" when the RAM is nearly full.
> I have swap turned off and have turned on overcommitment using sysctl,
> setting vm.overcommit_memory=0 (i.e. the option where any memory
> allocation attempt will succeed no matter the remaining free RAM). I was
> encountering RS crashes caused by "java.io.IOException: Cannot run
> program "bash": java.io.IOException: error=12, Cannot allocate memory".
> However, my processes should never add up to more than the available RAM
> minus the minimum for the OS.
>
If it happens again, can I see stack trace for the above?
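
To double-check which overcommit mode a box is actually running with, something like the following could be run on it. This is only a minimal sketch, assuming a Linux host that exposes /proc/sys/vm/overcommit_memory; the class name OvercommitCheck and the output wording are made up for illustration. The mode descriptions follow the kernel documentation: 0 is heuristic overcommit (the default), 1 always overcommits, 2 does strict accounting.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of HBase or Hadoop.
final class OvercommitCheck {
    // Standard Linux location of the setting; absent on other systems.
    static final Path SETTING = Paths.get("/proc/sys/vm/overcommit_memory");

    // Maps the numeric mode to its meaning per the kernel documentation.
    static String describe(int mode) {
        switch (mode) {
            case 0: return "heuristic overcommit (kernel default)";
            case 1: return "always overcommit";
            case 2: return "strict accounting, no overcommit";
            default: return "unknown mode " + mode;
        }
    }

    public static void main(String[] args) throws IOException {
        if (!Files.isReadable(SETTING)) {
            System.out.println(SETTING + " not readable; not a Linux host?");
            return;
        }
        String raw = new String(Files.readAllBytes(SETTING),
                                StandardCharsets.US_ASCII).trim();
        int mode = Integer.parseInt(raw);
        System.out.println("vm.overcommit_memory=" + mode + ": " + describe(mode));
    }
}
```

Running it on each regionserver host next to the stack trace would show whether the sysctl change really took effect there.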


> How would these manifest? I guess it is not related, but on the same note,
> I am encountering quite a high disk failure rate on machines running HBase/HDFS.
>


If all worked as designed, you'd not see anything.  A corrupted block
would be put aside and a new replica made from a good replica would
take its place.  But IIRC, corruption rate was really high on these
machines.  Do you ever run into files missing blocks?

Yeah, the disks were cheapies.

Any chance of different hardware?


> In general, the HDFS contains only HBase files, so at this point the memory
> consumption on NN is not an issue, so I have lowered that back to the
> defaults and will observe.
>

Yeah, this is probably better.  You are now like most others on this list.

> For the import I can understand, but when I am evaluating the querying
> performance, almost no writes (besides small statistics data) are going on,
> yet HBase pauses as a whole, not only one RS (which I would expect to be
> the case if writes were being flushed in the statistics table, which has
> one region).
>

Let's try and dig in on this.  This shouldn't be happening.  Anything
in regionserver logs at the time of the pause?

St.Ack
