RE: Fwd: why compaction failure on one table brings other tables offline, how to recover

Jayesh Patel Tue, 12 Apr 2016 08:36:01 -0700

Well, the total RAM on the VM is 6GB with no swap space, so the OS and other
Accumulo processes have enough.  I meant that 300MB is currently available for
the tserver process to use, as reported by 'free'.  tserver.sort.buffer.size is
set to 100MB.  I was able to start it up today; some change in the dynamics, I
guess.

-----Original Message-----
From: Josh Elser [mailto:[email protected]]
Sent: Tuesday, April 12, 2016 11:11 AM
To: [email protected]
Subject: Re: Fwd: why compaction failure on one table brings other tables 
offline, how to recover

Jayesh Patel wrote:
> Josh, The OOM tserver process was killed by the kernel, it didn't hang
> around.  I tried restarting it manually, but it ran out of memory
> right away and was killed again leaving the tablet offline.  It must
> have a huge "recovery" log to go through.  HDFS
> /accumulo/wal/instance-accumulo+9997/24e08581-a081-4b41-afc5-d75bdda6c
> f15 is about 42MB, and machine has about 300MB free and apparently not
> enough for tserver.
>

Ok, cool. If you're that constrained on resources, you can also try reducing 
the property tserver.sort.buffer.size in accumulo-site.xml. It defaults to 
200M, you could try 25M or 50M instead.

This is a buffer size that is used for sorting log edits during the recovery 
process. This might help if you never make it through the recovery process.
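
As a rough sketch of the change described above, the entry in
accumulo-site.xml might look like the following. The 50M value is only one of
the sizes suggested here, not a tuned recommendation, and this assumes the
usual Hadoop-style property layout of that file. The setting would be read the
next time the tserver starts.

```xml
<!-- accumulo-site.xml: shrink the buffer used to sort log edits during recovery -->
<!-- 50M is an illustrative value; 25M is the other size suggested above -->
<property>
  <name>tserver.sort.buffer.size</name>
  <value>50M</value>
</property>
```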

300MB is a little low in general as far as headroom goes (especially when 
you're already not giving Accumulo enough RAM). Typically, you want to ensure 
that you give the operating system at least 1G of memory for itself.

