If you only have 4G available, >=2G is probably a little excessive for
the OS :)
On 6/25/14, 3:30 PM, Sean Busbey wrote:
You can also calculate how much memory you need to have (or your cluster
management software can do it for you).

Things to factor:

- OS needs (>= 2GB)
- DataNode
- TaskTracker (or NodeManager, depending on MRv1 vs YARN)
- task memory (child slots * per-child max under MRv1)
- TServer Java heap
- TServer native map

Plus any other processes you regularly run on those nodes. A rough tally
is sketched below.
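As an illustration (numbers invented for a 4GB node, not tuned
recommendations), the tally might look like:

    OS                  2.0 GB
    DataNode            1.0 GB
    NodeManager         1.0 GB
    task containers     1.0 GB   (e.g. 2 * 512MB)
    TServer Java heap   0.5 GB
    TServer native map  0.5 GB
    --------------------------
    total               6.0 GB   on a 4GB box -- oversubscribed by 2GB

If the total exceeds physical RAM, the kernel will start killing
processes, and a big Java heap is a likely target.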
On Wed, Jun 25, 2014 at 2:07 PM, John Vines <[email protected]> wrote:
It's also possible that you're oversubscribing the memory on the
overall system between the tservers and the MR slots. Check your
syslogs and see if there's anything about killed Java processes.
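For example (log file name varies by distro, and the exact wording of
the OOM killer messages varies by kernel, so treat this as a starting
point):

    dmesg | grep -i 'killed process'
    grep -i 'out of memory' /var/log/syslog /var/log/messages 2>/dev/null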
On Wed, Jun 25, 2014 at 3:05 PM, Jacob Rust <[email protected]> wrote:
I will play around with the memory settings some more; it sounds
like that is definitely it. Thanks everyone!
On Wed, Jun 25, 2014 at 2:55 PM, Josh Elser <[email protected]> wrote:
The lack of an exception in the debug log makes it seem even more
likely that you just got an OOME. It's a crap-shoot as to whether or
not you'll actually get the Exception printed in the log, but you
should always get it in the .out/.err files, as previously mentioned.
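Something along these lines should find it (assuming the default log
directory; adjust the path for your install):

    grep -l 'OutOfMemoryError' $ACCUMULO_HOME/logs/*.out $ACCUMULO_HOME/logs/*.err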
On 6/25/14, 2:44 PM, Jacob Rust wrote:
Ah, here is the right log: http://pastebin.com/DLEzLGqN
I will double check which example. Thanks.
On Wed, Jun 25, 2014 at 2:38 PM, John Vines <[email protected]> wrote:
And you're certain you're using the standalone example and not the
native-standalone? The native-standalone configs expect the native
libraries to be present; if they aren't, the in-memory map falls back
to the Java heap and will eventually cause an OOM.
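You can confirm which one you copied by checking the native maps
property in your accumulo-site.xml (property name as of 1.5; if it's
absent, it takes the default):

    grep -A 1 'tserver.memory.maps.native.enabled' $ACCUMULO_HOME/conf/accumulo-site.xml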
On Wed, Jun 25, 2014 at 2:33 PM, Jacob Rust <[email protected]> wrote:
Accumulo version 1.5.1.2.1.2.1-471
Hadoop version 2.4.0.2.1.2.1-471
tserver debug log http://pastebin.com/BHdTkxeK
I see what you mean about the memory. I am using the memory settings
from the 512MB standalone example files
(https://github.com/apache/accumulo/tree/master/conf/examples/512MB/standalone).
I also ran into this problem using the 1GB example memory settings.
Each node has 4GB RAM.
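In case it helps, the relevant knobs from those example files are the
tserver heap in accumulo-env.sh and the native map size in
accumulo-site.xml; I'm reading them with (run from the Accumulo
install dir):

    grep ACCUMULO_TSERVER_OPTS conf/accumulo-env.sh
    grep -A 1 'tserver.memory.maps.max' conf/accumulo-site.xml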
Thanks
On Wed, Jun 25, 2014 at 2:10 PM, Sean Busbey <[email protected]> wrote:
What version of Accumulo?
What version of Hadoop?
What does your server memory and per-role allocation look like?
Can you paste the tserver debug log?
On Wed, Jun 25, 2014 at 1:01 PM, Jacob Rust <[email protected]> wrote:
I am trying to create an inverted text index for a table using
Accumulo input/output format in a Java MapReduce program. When the
job reaches the reduce phase and creates the table / tries to write
to it, the tablet servers begin to die.

Now when I do a start-all.sh, the tablet servers start for about a
minute and then die again. Any idea as to why the MapReduce job is
killing the tablet servers and/or how to bring the tablet servers
back up without failing?

This is on a 12 node cluster with low quality hardware. The Java code
I am running is here: http://pastebin.com/ti7Qz19m

The log files on each tablet server only display the startup
information, no errors. The log files on the master server show these
errors: http://pastebin.com/LymiTfB7
--
Jacob Rust
Software Intern

--
Sean