I found the message in tserver*.out. tserver*.err has 0 in it.
I posted last night, life was good, sat down this morning and saw that
another tserver had crashed overnight, with no activity. ?? In tserver*.out
it again says out of heap space.
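For anyone chasing the same symptom, here is a minimal sketch of how one might scan the tserver*.out files for the heap message. The function name `find_oom_logs` is made up for illustration, and it assumes the JVM wrote its standard `java.lang.OutOfMemoryError` line to those files.

```shell
# List tserver stdout logs that contain a JVM out-of-memory message.
# $1: directory holding the tserver*.out files (an assumption; adjust to your layout)
find_oom_logs() {
    oom_logdir="$1"
    # -l prints only the names of matching files, -F treats the pattern literally
    grep -lF 'java.lang.OutOfMemoryError' "$oom_logdir"/tserver*.out 2>/dev/null
}

# Usage, assuming the logs live under $ACCUMULO_LOG_DIR:
#   find_oom_logs "$ACCUMULO_LOG_DIR"
```

The function prints nothing and returns non-zero when no log mentions the error, so it can also be used in a cron job or a quick `if` test.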
ACCUMULO_TSERVER_OPTS=-Xmx2G -Xms1G. I would
You can use start-here.sh on the host in question or `start-server.sh
$hostname tserver`. FWIW, re-invoking start-all should just ignore the
hosts which already have processes running and just start a tserver on
the host that died.
2G should be enough to get a connector and read a table.
So start-here.sh does it. Thanks for pointing that out. I was looking all
through the shell commands.
I did try, from the master, start-all.sh and it worked for starting the
tserver, but I noticed that on the master, it increased the number of
processes labeled Main from the usual five to
Thanks Josh. But what do you mean by jstack'ing? I'm unfamiliar with
that term. A better question would be how can one troubleshoot such a
thing?
btw
I am the sole user on this cluster.
On Tue, Oct 7, 2014 at 4:18 PM, Josh Elser josh.el...@gmail.com wrote:
Ok, this record:
tcp0
Jstack is a tool which can be used to tell a java process to dump the
current stack traces for all of its threads. It's usually included with the
JDK. `kill -3 $pid` also does the same. If the output can't be redirected
automatically to your shell, check the stdout for the process you gave as
an
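A rough sketch of the two options described above, wrapped in one helper. The name `dump_threads` and the default output file name are hypothetical; the only tools assumed are `jstack` from the JDK and the standard `kill`.

```shell
# Capture a thread dump from a running Java process.
# $1: process id, $2: output file (optional, hypothetical default name)
dump_threads() {
    dt_pid="$1"
    dt_out="${2:-jstack-$dt_pid.txt}"
    if command -v jstack >/dev/null 2>&1; then
        # jstack prints the stack traces to its own stdout, so redirect to a file
        jstack "$dt_pid" > "$dt_out"
    else
        # SIGQUIT makes the JVM print the dump to the process's own stdout
        # (for a tserver that would be its .out file), not to this shell
        kill -3 "$dt_pid"
    fi
}

# Usage: dump_threads 12345 tserver-threads.txt
```

Taking two or three dumps a few seconds apart usually makes it easier to see which threads are actually stuck.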
Just for the record, I finally got to the bottom of things. One of my
Tservers was running out of memory. I hadn't noticed. I had my SA
allocate a little more--each node now has 6G up from 2G--and things are
working better.
On Oct 8, 2014 10:09 AM, Josh Elser josh.el...@gmail.com wrote:
Did you find the message in the tserver*.out, tserver*.err or the monitor
page?
(Thanks for the follow-up message.)
On Wed, Oct 8, 2014 at 6:39 PM, Geoffry Roberts threadedb...@gmail.com
wrote:
Just for the record, I finally got to the bottom of things. One of my
Tservers was running out of
All,
Thanks for the responses.
Is this a problem for Accumulo?
Reverse DNS is yielding my ISP's host name. You know the drill, my IP in
reverse followed by their domain name, as opposed to my FQDN, which is what I
use in my config files.
- Running Accumulo 1.5.1
- I have only one interface.
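One way to check the forward/reverse mismatch described above is sketched here. The helper names `normalize_host` and `same_host` are invented for this example, and the usage lines assume a Linux host where `hostname -f` and `getent` are available.

```shell
# Normalize a host name: lower-case it and drop a trailing dot, if any.
normalize_host() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# Succeeds when the name used in the config files and the name
# returned by a reverse lookup refer to the same host name.
same_host() {
    [ "$(normalize_host "$1")" = "$(normalize_host "$2")" ]
}

# Usage on a cluster node (assumes hostname -f and getent exist there):
#   fqdn=$(hostname -f)
#   ip=$(getent hosts "$fqdn" | awk '{print $1; exit}')
#   rev=$(getent hosts "$ip" | awk '{print $2; exit}')
#   same_host "$fqdn" "$rev" || echo "reverse DNS gives $rev, config uses $fqdn"
```

The comparison is kept separate from the lookups so the same check works with whatever resolver tool is installed.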
Can you provide the output from netstat, lsof or /proc/$pid/fd for the
tserver? Assuming you haven't altered tserv.port.client in
accumulo-site.xml, we want the line for port 9997.
From my laptop running a tserver on localhost:
$ netstat -na | grep 9997
tcp4 0 0 127.0.0.1.9997
...it's when
you make a Connector, and your client will talk to a tabletserver to
authenticate, that your program should hang. It would be good to
verify that.
My program should hang? Would you expand? That is exactly what it is
doing. I am able to get a connector. But when I try to iterate
Ok, this record:
tcp 0 0 0.0.0.0:9997 0.0.0.0:* LISTEN
Means that your tserver is listening on the correct port on all interfaces.
There shouldn't be issues connecting to the tserver. This is also
confirmed by the fact that you authenticated and got a Connector (this
does
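The reading of that netstat record can be sketched as a small filter. The function name `listen_addrs` is hypothetical; it assumes `netstat -na` style input where the fourth column is the local address and the last column is the socket state, which covers both the Linux (`0.0.0.0:9997`) and BSD/macOS (`127.0.0.1.9997`) layouts.

```shell
# Read `netstat -na` output on stdin and report where a port is listening.
# $1: port number, e.g. 9997 (the default tserv.port.client)
listen_addrs() {
    la_port="$1"
    awk -v port="$la_port" '
        # keep only LISTEN rows whose local address ends in :port or .port
        $NF == "LISTEN" && $4 ~ ("[.:]" port "$") {
            addr = $4
            sub("[.:]" port "$", "", addr)
            if (addr == "0.0.0.0" || addr == "*" || addr == "::")
                scope = "all interfaces"
            else if (addr ~ /^127\./ || addr == "::1")
                scope = "loopback only"
            else
                scope = "single interface"
            printf "%s: %s\n", addr, scope
        }'
}

# Usage: netstat -na | listen_addrs 9997
```

A "loopback only" result would mean remote clients cannot reach that tserver, while "all interfaces" matches the record quoted above.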
Hi Geoffry!
What version of Accumulo are you using?
Can you check your DNS on the cluster?
1) Does 'hostname' return the name you expect from the client? (the client
must be able to see all ZK servers and all tablet servers in the cluster)
2) Do your cluster config files contain the same host
If you add the following Log4j code before scanning, maybe the trace
messages from Accumulo client code will shed some light on what's happening.
Logger.getLogger("org.apache.accumulo.core.client").setLevel(Level.TRACE);
On Mon, Oct 6, 2014 at 5:26 PM, Geoffry Roberts threadedb...@gmail.com
wrote:
Accumulo tservers typically listen on a single interface. If you have a
server with multiple interfaces (e.g. loopback and eth0), you might have a
problem in which the tablet servers are not listening on externally
reachable interfaces. Tablet servers will list the interfaces that they are