Increase the maximum Java heap space you give the loggers in
$ACCUMULO_HOME/conf/accumulo-env.sh.
Most likely, the write-ahead logs (WALs) on disk are too large for your
configured maximum heap.
After you make the change, stop everything, make sure every process has
actually died, and then restart.
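As a rough sketch of that change, assuming a 1.4-style accumulo-env.sh in which the logger JVM options are held in ACCUMULO_LOGGER_OPTS (check the variable name in your own file; the 1g figure is only an example, not a recommendation):

```shell
# Sketch of the logger heap setting in $ACCUMULO_HOME/conf/accumulo-env.sh.
# Assumption: the logger JVM options live in ACCUMULO_LOGGER_OPTS.
# Assumption: 1g is an illustrative value; size it to your WALs and hardware.
export ACCUMULO_LOGGER_OPTS="-Xmx1g -Xms256m"

# Print the value so you can confirm what the logger JVM will be given.
echo "logger JVM options: $ACCUMULO_LOGGER_OPTS"
```

-Xmx sets the maximum heap and -Xms the initial heap; only -Xmx needs to grow to address the OutOfMemoryError.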
On 05/09/2013 03:41 PM, Slater, David M. wrote:
From the out log file, it looks like the logger processes are killing
themselves:
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 20759"...
I’m not doing any other concurrent Accumulo operations, though, so I
can’t figure out why I would be running out of heap space in the JVM.
Thoughts?
*From:* John Vines
*Sent:* Thursday, May 09, 2013 3:08 PM
*To:* [email protected]
*Subject:* Re: warning: there are no loggers registered in zookeeper
Are your logger processes dying? Check their logs, including the out
and err files if they are.
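One way to scan those files quickly is sketched below. The helper name find_oom_logs is hypothetical, and the log location ($ACCUMULO_LOG_DIR, falling back to $ACCUMULO_HOME/logs) is an assumption about a default install:

```shell
# Hypothetical helper: print the .out/.err files in a log directory that
# mention an OutOfMemoryError.
find_oom_logs() {
    log_dir="$1"
    for f in "$log_dir"/*.out "$log_dir"/*.err; do
        # Skip unmatched glob patterns, which the shell leaves as literal text.
        [ -f "$f" ] || continue
        # -l prints only the file name, -F treats the pattern as a fixed string.
        # "|| true" keeps a non-matching file from aborting under "set -e".
        grep -lF 'java.lang.OutOfMemoryError' "$f" || true
    done
    return 0
}

# Assumption: logs live in $ACCUMULO_LOG_DIR, else $ACCUMULO_HOME/logs.
matches=$(find_oom_logs "${ACCUMULO_LOG_DIR:-${ACCUMULO_HOME:-.}/logs}")
if [ -n "$matches" ]; then
    echo "$matches"
else
    echo "no OutOfMemoryError found in .out/.err files"
fi
```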
On Thu, May 9, 2013 at 2:58 PM, Slater, David M. wrote:
The recoveries were kicked off, but their copy/sort never got beyond
0%. I get the same string of warnings I got before. I then get "WARN:
Recovery of *.*.*.*:11224:6087aec0-c6e7-4473-84f3-8e78fb1eca5d failed"
for the data nodes.
Thoughts?
David
-----Original Message-----
From: Josh Elser
Sent: Thursday, May 09, 2013 2:54 PM
To: [email protected]
Subject: Re: warning: there are no loggers registered in zookeeper
Do a stop-all.sh. Make sure everything is actually down (use pssh/pdsh
with ps/jps to confirm that all Accumulo processes are stopped). Run
start-all.sh again and see if you still have issues (you'll likely
have some WAL recoveries kick off).
If you do, check the logger and tserver log files to get the actual
problem.
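The "is everything actually down" check could look roughly like this on a single host. The helper name filter_accumulo_procs is hypothetical, and the sketch assumes Accumulo services are launched through the org.apache.accumulo.start.Main class, so that name shows up on the JVM command line:

```shell
# Hypothetical helper: keep only the lines of ps output that look like
# Accumulo JVMs. Assumption: the services are started through
# org.apache.accumulo.start.Main, so that class name is on the command line.
filter_accumulo_procs() {
    grep 'org\.apache\.accumulo\.start\.Main'
}

# Check the local host only; run this through pssh/pdsh to cover every node.
# "|| true" absorbs grep's non-zero exit status when nothing matches.
leftover=$(ps -eo pid,args 2>/dev/null | filter_accumulo_procs || true)
if [ -n "$leftover" ]; then
    echo "Accumulo processes still running:"
    echo "$leftover"
else
    echo "no Accumulo processes found on this host"
fi
```

Anything it lists after stop-all.sh has finished is a leftover process to kill by hand before running start-all.sh.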
On 05/09/2013 02:48 PM, Slater, David M. wrote:
>
> Hey everyone,
>
> After a bad shutdown, I'm getting the warning "there are no loggers
> registered in zookeeper", followed by "warning: unable to connect to
> ***: org.apache.thrift.transport.TTransportException:
> java.net.ConnectException: Connection refused". This then leads to a
> number of timeout errors, "unable to get tablet server status". During
> the entire time I have no access to the tablet servers. And then it
> crashes.
>
> Is there a way to get the loggers back into a good state without
> destroying all of the tables?
>
> Thanks,
> David
>