RE: EXTERNAL: Re: Failing Tablet Servers

Cardon, Tejay E Fri, 21 Sep 2012 07:36:23 -0700

Gotcha.  So if I'm using java maps then my tserver_opts needs to be 
tserver.memory.maps + extra for the rest of the tserver because the memory map 
will be taken from the overall memory allocated to the tserver.  But if I'm 
using native maps, then I need far less tserver memory because the map memory 
is not deducted from the tserver.  Is that correct?

Thanks,
tejay

From: John Vines [mailto:[email protected]]
Sent: Friday, September 21, 2012 8:26 AM
To: [email protected]
Subject: Re: EXTERNAL: Re: Failing Tablet Servers

memory.maps is what defines the size of the in memory map. When using native 
maps, that space does not come out of the heap size. But when using non-native 
maps, it comes out of the heap space.
I think the issue Eric is trying to get at is the fickleness of the Java 
garbage collector. When you give a process that much heap, it can hold that 
much more data before it needs to garbage collect. However, that also means 
that when it does garbage collect, it's collecting a LOT more, which can result 
in poor performance.

John

On Fri, Sep 21, 2012 at 10:12 AM, Cardon, Tejay E <[email protected]> wrote:
Jim, Eric, and Adam,
Thanks.  It sounds like you're all saying the same thing.  Originally I was 
doing each key/value as its own mutation, and it was blowing up much faster 
(probably due to the volume/overhead of the mutation objects themselves).  I'll 
try refactoring to break them up into something in-between.  My keys are small 
(<25 Bytes), and my values are empty, but I'll aim for ~1,000 key/values per 
mutation and see how that works out for me.

Eric,
I was under the impression that the memory.maps setting was not very important 
when using native maps.  Apparently I'm mistaken there.  What does this setting 
control when in a native map setting?  And, in general, what's the proper 
balance between tserver_opts and tserver.memory.maps?

Regarding the "Finished gathering information from 24 servers in 27.45 
seconds" message: do you have any recommendations for how to chase down the 
bottleneck?  I'm pretty sure I'm having GC issues, but I'm not sure what is 
causing them on the server side.  I'm sending a fairly small number of very 
large mutation objects, which I'd expect to be a moderate problem for the GC, 
but not a huge one.

Thanks again to everyone for being so responsive and helpful.

Tejay Cardon

From: Eric Newton [mailto:[email protected]]
Sent: Friday, September 21, 2012 8:03 AM
To: [email protected]
Subject: EXTERNAL: Re: Failing Tablet Servers

A few items noted from your logs:

tserver.memory.maps.max = 1G

If you are giving your processes 10G, you might want to make the map larger, 
say 6G, and then reduce the JVM by 6G.
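
As a rough illustration of that arithmetic, here is a minimal Python sketch. 
It assumes, as described elsewhere in this thread, that native maps live 
outside the JVM heap while non-native maps are carved out of it. The function 
name and dictionary keys are hypothetical and are not Accumulo settings.

```python
def tserver_memory_plan(budget_gb, map_gb, native_maps):
    """Split a per-tserver memory budget between the JVM heap and the in-memory map.

    Assumption: a native map is allocated outside the JVM heap, while a
    non-native (Java) map is taken out of the heap.
    """
    if not 0 < map_gb < budget_gb:
        raise ValueError("map size must be positive and smaller than the budget")
    if native_maps:
        # The map sits off-heap, so the JVM only needs what is left over.
        jvm_heap_gb = budget_gb - map_gb
        heap_for_rest_gb = jvm_heap_gb
    else:
        # The map is inside the heap, so the heap must cover the whole budget.
        jvm_heap_gb = budget_gb
        heap_for_rest_gb = budget_gb - map_gb
    return {
        "jvm_heap_gb": jvm_heap_gb,
        "off_heap_map_gb": map_gb if native_maps else 0,
        "heap_for_rest_gb": heap_for_rest_gb,
    }


# The numbers from this thread: a 10G budget with a 6G map.
print(tserver_memory_plan(10, 6, native_maps=True))
print(tserver_memory_plan(10, 6, native_maps=False))
```

Either way the rest of the tserver gets about 4G of heap; the difference is 
that with native maps the JVM itself is only 4G, so the garbage collector has 
far less to scan.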

Write-Ahead Log recovery complete for rz<;zw== (8 mutations applied, 8000000 
entries created)

You are creating rows with 1M columns.  This is ok, but you might want to write 
them out more incrementally.

WARN : Running low on memory

That's pretty self-explanatory.  I'm guessing that the very large mutations are 
causing the tablet servers to run out of memory before they are held waiting 
for minor compactions.

Finished gathering information from 24 servers in 27.45 seconds

Something is running slow, probably due to GC thrashing.

WARN : Lost servers [10.1.24.69:9997[139d46130344b98]]

And there's a server crashing, probably due to an OOM condition.

Send smaller mutations.  Maybe keep it to 200K column updates.  You can still 
have 1M wide rows, just send 5 mutations.
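
A minimal Python sketch of that batching idea follows. It only shows how one 
wide row's column updates could be split into bounded groups; the helper name 
and the 200,000 default are illustrative assumptions. In a real ingest client 
each group would become its own Accumulo Mutation for the same row, handed to 
a BatchWriter.

```python
from itertools import islice


def chunk_updates(updates, max_per_mutation=200_000):
    """Yield lists of at most max_per_mutation column updates.

    Each yielded list stands in for the contents of one mutation; all of
    them would target the same row, so the row itself stays wide.
    """
    if max_per_mutation <= 0:
        raise ValueError("max_per_mutation must be positive")
    iterator = iter(updates)
    while True:
        batch = list(islice(iterator, max_per_mutation))
        if not batch:
            return
        yield batch


# One row with 1,000,000 (hypothetical) column updates.
column_updates = range(1_000_000)
batch_sizes = [len(batch) for batch in chunk_updates(column_updates)]
print(len(batch_sizes), "mutations, largest has", max(batch_sizes), "updates")
```

Because the helper consumes an iterator lazily, the client never has to hold 
more than one batch of updates in memory at a time.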

-Eric

On Thu, Sep 20, 2012 at 5:05 PM, Cardon, Tejay E <[email protected]> wrote:
I'm seeing some strange behavior on a moderate (30 node) cluster.  I've got 27 
tablet servers on large Dell servers with 30GB of memory each.  I've set the 
TServer_OPTS to give them each 10G of memory.  I'm running an ingest process 
that uses AccumuloInputFormat in a MapReduce job to write 1,000 rows with each 
row containing ~1,000,000 columns in 160,000 families.  The MapReduce initially 
runs quite quickly and I can see the ingest rate peak on the monitor page.  
However, after about 30 seconds of high ingest, the ingest falls to 0.  It then 
stalls out and my map tasks are eventually killed.  In the end, the MapReduce 
job fails and I usually end up with between 3 and 7 of my tservers dead.

Inspecting the tserver.err logs shows nothing, even on the nodes that fail.  
The tserver.out log shows a java OutOfMemoryError, and nothing else.  I've 
included a zip with the logs from one of the failed tservers and a second one 
with the logs from the master.  Other than the out of memory, I'm not seeing 
anything that stands out to me.

If I reduce the data size to only 100,000 columns, rather than 1,000,000, the 
process takes about 4 minutes and completes without incident.

Am I just ingesting too quickly?

Thanks,
Tejay Cardon
