Thanks, Eric and John, for the replies. In case anyone else runs into this
problem, here is what I found.
On the slave nodes, in the Accumulo log dir, the file
tserver_<computer_name>.out stated:
The stack size specified is too small, Specify at least 160k
I had accidentally put a k instead of an m on the tserver line in
accumulo/conf/accumulo-env.sh:
test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx500m -Xms128m -Xss128k"
instead of
test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx500m -Xms128m -Xss128m"
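A unit slip like this can be caught before restarting the tablet servers by parsing the -Xss value and comparing it with the minimum the JVM reported. The sketch below is only an illustration: xss_to_kb and check_xss are names made up for this example (they are not part of Accumulo), and the 160k floor is taken from the error message above, so it may differ on other JVMs or platforms.

```shell
#!/bin/sh
# Hypothetical helper (not part of Accumulo): convert a JVM size such as
# 128k, 128m or 1g to kilobytes. A bare number is treated as bytes.
xss_to_kb() {
    value=$1
    num=${value%[kKmMgG]}
    case $value in
        '')    echo 0 ;;
        *[kK]) echo "$num" ;;
        *[mM]) echo $((num * 1024)) ;;
        *[gG]) echo $((num * 1024 * 1024)) ;;
        *)     echo $((value / 1024)) ;;
    esac
}

# Hypothetical helper: find the -Xss option in an options string and compare
# it with a minimum in KB (default 160, an assumption from the error text).
check_xss() {
    opts=$1
    min_kb=${2:-160}
    for opt in $opts; do
        case $opt in
            -Xss*)
                kb=$(xss_to_kb "${opt#-Xss}")
                if [ "$kb" -lt "$min_kb" ]; then
                    echo "too small: $opt (${kb}k < ${min_kb}k)"
                    return 1
                fi
                echo "ok: $opt"
                return 0
                ;;
        esac
    done
    echo "no -Xss option found"
    return 0
}
```

For example, check_xss "$ACCUMULO_TSERVER_OPTS" could be run after sourcing accumulo-env.sh. Starting java by hand with the same options plus -version should also show whether the JVM rejects them.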
Thanks again,
Sandy
From: Eric Newton [mailto:[email protected]]
Sent: Monday, January 07, 2013 4:57 PM
To: [email protected]; vines
Subject: Re: Thrift error while configuring new cloud
Make sure the $ACCUMULO_LOG_DIR exists on all the nodes, too.
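One way to check this across the cluster is to loop over the hosts and test for the directory over ssh. The sketch below assumes passwordless ssh to every node and a hosts file with one hostname per line. check_log_dirs and the REMOTE_SH override are hypothetical names for this example, not Accumulo scripts.

```shell
#!/bin/sh
# Hypothetical helper: report hosts on which the log directory is missing.
# $1 = file with one hostname per line, $2 = directory to look for.
# REMOTE_SH may be set to another command for testing; it defaults to ssh.
check_log_dirs() {
    hosts_file=$1
    log_dir=$2
    missing=0
    while read -r host; do
        # Skip blank lines and comments.
        case $host in ''|'#'*) continue ;; esac
        # </dev/null keeps ssh from swallowing the rest of the hosts file.
        if ${REMOTE_SH:-ssh} "$host" "test -d '$log_dir'" </dev/null; then
            echo "$host: ok"
        else
            echo "$host: MISSING $log_dir"
            missing=1
        fi
    done < "$hosts_file"
    return $missing
}
```

Usage might look like check_log_dirs "$ACCUMULO_HOME/conf/slaves" "$ACCUMULO_LOG_DIR", assuming the slaves file lives in that location on your install.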
-Eric
On Mon, Jan 7, 2013 at 4:48 PM, John Vines <[email protected]> wrote:
If you visit the monitor page on port 50095, it aggregates the logs so you
don't have to dig them up.
It looks like your tserver and logger are not running, which is why things are
not working for you. There may be log info on the monitor page, but you may
have to go to your logs directory and find their logs. There may only be
a .out and a .err file for them, which should describe the error.
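A quick way to see whether any of those .out or .err files contain anything is to print only the non-empty ones. This is a minimal sketch: show_startup_errors is a hypothetical helper name, and the directory argument is whatever your log directory happens to be.

```shell
#!/bin/sh
# Hypothetical helper: print every non-empty .out/.err file in a log directory.
show_startup_errors() {
    log_dir=$1
    for f in "$log_dir"/*.out "$log_dir"/*.err; do
        # -s is false for empty files and for globs that matched nothing.
        [ -s "$f" ] || continue
        echo "==> $f"
        cat "$f"
    done
}
```

Running show_startup_errors "$ACCUMULO_LOG_DIR" on each node would then show only the files that actually recorded a startup failure.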
On Mon, Jan 7, 2013 at 4:34 PM, Hider, Sandy <[email protected]> wrote:
In monitor_<servername>.log
04 16:56:46,085 [server.Accumulo] INFO : tserver.bloom.load.concurrent.max = 4
04 16:56:46,085 [server.Accumulo] INFO : tserver.bulk.assign.threads = 1
04 16:56:46,085 [server.Accumulo] INFO : tserver.bulk.process.threads = 1
04 16:56:46,085 [server.Accumulo] INFO : tserver.bulk.retry.max = 3
04 16:56:46,085 [server.Accumulo] INFO : tserver.cache.data.size = 50M
04 16:56:46,086 [server.Accumulo] INFO : tserver.cache.index.size = 512M
04 16:56:46,086 [server.Accumulo] INFO : tserver.client.timeout = 3s
04 16:56:46,086 [server.Accumulo] INFO : tserver.compaction.major.concurrent.max = 3
04 16:56:46,088 [server.Accumulo] INFO : tserver.compaction.major.delay = 30s
04 16:56:46,088 [server.Accumulo] INFO : tserver.compaction.major.thread.files.open.max = 10
04 16:56:46,089 [server.Accumulo] INFO : tserver.compaction.minor.concurrent.max = 4
04 16:56:46,089 [server.Accumulo] INFO : tserver.default.blocksize = 1M
04 16:56:46,089 [server.Accumulo] INFO : tserver.dir.memdump = /tmp
04 16:56:46,089 [server.Accumulo] INFO : tserver.files.open.idle = 1m
04 16:56:46,089 [server.Accumulo] INFO : tserver.hold.time.max = 5m
04 16:56:46,089 [server.Accumulo] INFO : tserver.logger.count = 2
04 16:56:46,089 [server.Accumulo] INFO : tserver.logger.strategy = org.apache.accumulo.server.tabletserver.log.RoundRobinLoggerStrategy
04 16:56:46,090 [server.Accumulo] INFO : tserver.logger.timeout = 30s
04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.lock = false
04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.manager = org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager
04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.maps.max = 1G
04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.maps.native.enabled = true
04 16:56:46,090 [server.Accumulo] INFO : tserver.metadata.readahead.concurrent.max = 8
04 16:56:46,090 [server.Accumulo] INFO : tserver.migrations.concurrent.max = 1
04 16:56:46,091 [server.Accumulo] INFO : tserver.monitor.fs = true
04 16:56:46,091 [server.Accumulo] INFO : tserver.mutation.queue.max = 256K
04 16:56:46,091 [server.Accumulo] INFO : tserver.port.client = 9997
04 16:56:46,091 [server.Accumulo] INFO : tserver.port.search = false
04 16:56:46,091 [server.Accumulo] INFO : tserver.readahead.concurrent.max = 16
04 16:56:46,091 [server.Accumulo] INFO : tserver.scan.files.open.max = 100
04 16:56:46,091 [server.Accumulo] INFO : tserver.server.threadcheck.time = 1s
04 16:56:46,092 [server.Accumulo] INFO : tserver.server.threads.minimum = 2
04 16:56:46,092 [server.Accumulo] INFO : tserver.session.idle.max = 1m
04 16:56:46,092 [server.Accumulo] INFO : tserver.tablet.split.midpoint.files.max = 30
04 16:56:46,092 [server.Accumulo] INFO : tserver.walog.max.size = 100M
04 16:56:47,059 [impl.ServerClient] WARN : tracer:atmsq-14-253 There are no tablet servers: check that zookeeper and accumulo are running.
04 16:56:49,318 [monitor.Monitor] INFO : Failed to obtain problem reports
java.lang.RuntimeException: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
    at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:186)
    at org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:241)
    at org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:299)
    at org.apache.accumulo.server.monitor.Monitor.fetchData(Monitor.java:367)
    at org.apache.accumulo.server.monitor.Monitor$2.run(Monitor.java:468)
    at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
    at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:241)
    at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:94)
    at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:176)
    ... 6 more
It then repeats the exception over and over. I don't see a tserver log file.
Is that inside another log?
Java running processes:
start.Main tracer --address atmsq-14-253
start.Main monitor --address atmsq-14-253
start.Main gc --address atmsq-14-253
start.Main master --address atmsq-14-253
hadoop.mapred.JobTracker
hadoop.hdfs.server.namenode.SecondaryNameNode
hadoop.hdfs.server.namenode.NameNode
zookeeper.server.quorum.QuorumPeerMain /home/clouduser/apps/zookeeper/bin/../conf/zoo.cfg
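A listing like the one above can be checked mechanically for missing Accumulo roles. The sketch below is an illustration only: missing_roles is a hypothetical helper, and it assumes each Accumulo process shows up as "start.Main <role>" in the listing, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read a process listing on stdin and print each
# expected role (given as arguments) that has no "start.Main <role>" entry.
missing_roles() {
    listing=$(cat)
    for role in "$@"; do
        if ! printf '%s\n' "$listing" | grep -Eq "start\.Main $role( |\$)"; then
            echo "$role"
        fi
    done
}
```

Something like ps -eo args | missing_roles master tserver logger gc monitor tracer would then print the roles that are not running on that host. Which roles should run on which node depends on your layout, so treat the role list as an assumption.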
I appreciate you looking at this.
Sandy
From: John Vines [mailto:[email protected]]
Sent: Monday, January 07, 2013 3:49 PM
To: [email protected]
Subject: Re: Thrift error while configuring new cloud
First check the monitor and see if any errors are popping up from the tserver.
If not, check the tserver log files, as well as out/err files, to make sure
it's starting.
On Mon, Jan 7, 2013 at 3:40 PM, Hider, Sandy <[email protected]> wrote:
All,
I am setting up a new cloud and I am getting the following error when running
Accumulo. From the Tracer log:
07 11:20:37,925 [impl.ServerClient] DEBUG: ClientService request failed null, retrying ...
org.apache.thrift.transport.TTransportException: Failed to connect to a server
    at org.apache.accumulo.core.client.impl.ThriftTransportPool.getAnyTransport(ThriftTransportPool.java:437)
    at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:145)
    at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:123)
    at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:105)
    at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
    at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:75)
    at org.apache.accumulo.server.client.HdfsZooInstance.getConnector(HdfsZooInstance.java:145)
    at org.apache.accumulo.server.trace.TraceServer.<init>(TraceServer.java:152)
    at org.apache.accumulo.server.trace.TraceServer.main(TraceServer.java:222)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.accumulo.start.Main$1.run(Main.java:89)
    at java.lang.Thread.run(Thread.java:722)
When I tried to go into the shell, it said it can't find any tablet servers and
to make sure zookeeper and accumulo are both running. They both are. My theory
is this: the name node is dual NIC'd, and I am wondering if somehow I have the
zookeeper software listening on the wrong NIC. I don't see a way to
configure which NIC it is on. Does anyone have any ideas?
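For what it's worth, ZooKeeper's zoo.cfg has an optional clientPortAddress setting that binds the client port to one address. When it is absent, the server accepts connections on all interfaces. A minimal sketch for checking what a given config says follows. zk_bind_settings is a hypothetical helper name, not a ZooKeeper tool.

```shell
#!/bin/sh
# Hypothetical helper: print the clientPort / clientPortAddress lines from a
# zoo.cfg file, with whitespace removed. Commented-out lines are ignored.
zk_bind_settings() {
    cfg=$1
    grep -E '^[[:space:]]*clientPort(Address)?[[:space:]]*=' "$cfg" \
        | sed 's/[[:space:]]//g'
}
```

For example, zk_bind_settings /home/clouduser/apps/zookeeper/conf/zoo.cfg. If only clientPort is printed, the NIC theory is less likely, and something like netstat -tln on the ZooKeeper host can confirm which address the port is actually bound to.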
This is Accumulo-1.4.0, Hadoop 0.20.2, Zookeeper 3.4.3 (a set I have used
successfully on other servers)
Thanks in advance
Sandy