RE: Thrift error while configuring new cloud

Hider, Sandy Wed, 16 Jan 2013 13:44:04 -0800

Thanks, Eric and John, for the replies.  In case anyone else runs into this 
problem, here is what I found.


On the slave nodes, in the Accumulo log dir, the file 
tserver_<computer_name>.out stated:

The stack size specified is too small, Specify at least 160k

I had accidentally put a k instead of an m on the -Xss option of the tserver 
line in accumulo/conf/accumulo-env.sh.  I had

test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx500m -Xms128m -Xss128k"

instead of

test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx500m -Xms128m -Xss128m"
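
A unit typo like this is easy to miss by eye, so a small check run before starting the servers may help. The following is only a sketch: xss_kb and check_xss are hypothetical helper names, not part of Accumulo, and the 160k threshold is simply the figure from the error message above (other JVMs may report a different minimum).

```shell
#!/bin/sh
# Hypothetical helpers (not part of Accumulo): convert the last -Xss value
# in a JVM option string to kilobytes so a wrong unit suffix stands out.
xss_kb() {
    # Split the options into one token per line and keep the last -Xss value.
    val=$(printf '%s\n' "$1" | tr ' ' '\n' \
        | sed -n 's/^-Xss\([0-9][0-9]*[kKmMgG]\{0,1\}\)$/\1/p' | tail -n 1)
    [ -n "$val" ] || return 1
    num=${val%[kKmMgG]}
    case "$val" in
        *[kK]) echo "$num" ;;
        *[mM]) echo $((num * 1024)) ;;
        *[gG]) echo $((num * 1024 * 1024)) ;;
        *)     echo $((num / 1024)) ;;   # no suffix means bytes
    esac
}

# 160 is the minimum quoted in the error message; treat it as an assumption.
check_xss() {
    kb=$(xss_kb "$1") || { echo "no -Xss option found"; return 0; }
    if [ "$kb" -lt 160 ]; then
        echo "stack size ${kb}k is below 160k"
        return 1
    fi
    echo "stack size ${kb}k looks acceptable"
}
```

After sourcing accumulo-env.sh, something like check_xss "$ACCUMULO_TSERVER_OPTS" would flag the 128k setting before the tserver is started.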

Thanks again,

Sandy



From: Eric Newton [mailto:[email protected]]
Sent: Monday, January 07, 2013 4:57 PM
To: [email protected]; vines
Subject: Re: Thrift error while configuring new cloud

Make sure the $ACCUMULO_LOG_DIR exists on all the nodes, too.

-Eric
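
One way to run the log directory check across a cluster is sketched below. It assumes a hosts file with one hostname per line (for example accumulo/conf/slaves), passwordless ssh to each host, and a log path without spaces. The function name missing_log_dir_hosts and the REMOTE_SH variable are made up for illustration.

```shell
#!/bin/sh
# Command used to reach each host; defaults to ssh. (Hypothetical knob,
# mainly so the loop can be tried out without a real cluster.)
REMOTE_SH=${REMOTE_SH:-ssh}

# Print every host from the hosts file on which the log directory is missing.
# A host that cannot be reached at all is printed as well.
missing_log_dir_hosts() {
    hosts_file=$1
    log_dir=$2
    while read -r host; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        # </dev/null keeps ssh from swallowing the rest of the hosts file.
        if ! "$REMOTE_SH" "$host" test -d "$log_dir" </dev/null; then
            echo "$host"
        fi
    done < "$hosts_file"
}
```

Calling missing_log_dir_hosts "$ACCUMULO_HOME/conf/slaves" "$ACCUMULO_LOG_DIR" should print nothing when every node has the directory.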


On Mon, Jan 7, 2013 at 4:48 PM, John Vines <[email protected]> wrote:
If you actually visit the monitor page, on port 50095, it does log aggregation 
so you don't have to dig up logs.
It looks like your tserver and logger are not running. This is why things are 
not working for you. There may be log info on the monitor page. But you may 
have to go to your logs directory and find the logs for them. There may only be 
a .out and .err file for them, which should describe the error.
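
A quick way to look through those .out and .err files on a node is sketched here. The function name show_startup_output is hypothetical, and the log directory is whatever $ACCUMULO_LOG_DIR points to on that machine.

```shell
#!/bin/sh
# Print the tail of every non-empty .out and .err file in a log directory.
# Empty files are skipped, since a clean start usually leaves them empty.
show_startup_output() {
    log_dir=$1
    for f in "$log_dir"/*.out "$log_dir"/*.err; do
        # -s is true only for files that exist and are non-empty.
        [ -s "$f" ] || continue
        echo "== $f"
        tail -n 20 "$f"
    done
}
```

Running show_startup_output "$ACCUMULO_LOG_DIR" on a slave node should surface any JVM startup message written before the regular log was created.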

On Mon, Jan 7, 2013 at 4:34 PM, Hider, Sandy <[email protected]> wrote:
In monitor_<servername>.log

04 16:56:46,085 [server.Accumulo] INFO :  tserver.bloom.load.concurrent.max = 4
04 16:56:46,085 [server.Accumulo] INFO :  tserver.bulk.assign.threads = 1
04 16:56:46,085 [server.Accumulo] INFO :  tserver.bulk.process.threads = 1
04 16:56:46,085 [server.Accumulo] INFO :  tserver.bulk.retry.max = 3
04 16:56:46,085 [server.Accumulo] INFO :  tserver.cache.data.size = 50M
04 16:56:46,086 [server.Accumulo] INFO :  tserver.cache.index.size = 512M
04 16:56:46,086 [server.Accumulo] INFO :  tserver.client.timeout = 3s
04 16:56:46,086 [server.Accumulo] INFO :  tserver.compaction.major.concurrent.max = 3
04 16:56:46,088 [server.Accumulo] INFO :  tserver.compaction.major.delay = 30s
04 16:56:46,088 [server.Accumulo] INFO :  tserver.compaction.major.thread.files.open.max = 10
04 16:56:46,089 [server.Accumulo] INFO :  tserver.compaction.minor.concurrent.max = 4
04 16:56:46,089 [server.Accumulo] INFO :  tserver.default.blocksize = 1M
04 16:56:46,089 [server.Accumulo] INFO :  tserver.dir.memdump = /tmp
04 16:56:46,089 [server.Accumulo] INFO :  tserver.files.open.idle = 1m
04 16:56:46,089 [server.Accumulo] INFO :  tserver.hold.time.max = 5m
04 16:56:46,089 [server.Accumulo] INFO :  tserver.logger.count = 2
04 16:56:46,089 [server.Accumulo] INFO :  tserver.logger.strategy = org.apache.accumulo.server.tabletserver.log.RoundRobinLoggerStrategy
04 16:56:46,090 [server.Accumulo] INFO :  tserver.logger.timeout = 30s
04 16:56:46,090 [server.Accumulo] INFO :  tserver.memory.lock = false
04 16:56:46,090 [server.Accumulo] INFO :  tserver.memory.manager = org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager
04 16:56:46,090 [server.Accumulo] INFO :  tserver.memory.maps.max = 1G
04 16:56:46,090 [server.Accumulo] INFO :  tserver.memory.maps.native.enabled = true
04 16:56:46,090 [server.Accumulo] INFO :  tserver.metadata.readahead.concurrent.max = 8
04 16:56:46,090 [server.Accumulo] INFO :  tserver.migrations.concurrent.max = 1
04 16:56:46,091 [server.Accumulo] INFO :  tserver.monitor.fs = true
04 16:56:46,091 [server.Accumulo] INFO :  tserver.mutation.queue.max = 256K
04 16:56:46,091 [server.Accumulo] INFO :  tserver.port.client = 9997
04 16:56:46,091 [server.Accumulo] INFO :  tserver.port.search = false
04 16:56:46,091 [server.Accumulo] INFO :  tserver.readahead.concurrent.max = 16
04 16:56:46,091 [server.Accumulo] INFO :  tserver.scan.files.open.max = 100
04 16:56:46,091 [server.Accumulo] INFO :  tserver.server.threadcheck.time = 1s
04 16:56:46,092 [server.Accumulo] INFO :  tserver.server.threads.minimum = 2
04 16:56:46,092 [server.Accumulo] INFO :  tserver.session.idle.max = 1m
04 16:56:46,092 [server.Accumulo] INFO :  tserver.tablet.split.midpoint.files.max = 30
04 16:56:46,092 [server.Accumulo] INFO :  tserver.walog.max.size = 100M
04 16:56:47,059 [impl.ServerClient] WARN : tracer:atmsq-14-253 There are no tablet servers: check that zookeeper and accumulo are running.
04 16:56:49,318 [monitor.Monitor] INFO :  Failed to obtain problem reports
java.lang.RuntimeException: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:186)
        at org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:241)
        at org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:299)
        at org.apache.accumulo.server.monitor.Monitor.fetchData(Monitor.java:367)
        at org.apache.accumulo.server.monitor.Monitor$2.run(Monitor.java:468)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
        at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:241)
        at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:94)
        at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:176)
        ... 6 more

It then repeats the exception over and over.   I don't see a tserver log file.  
Is that inside another log?

Java running processes:
start.Main tracer --address atmsq-14-253
start.Main monitor --address atmsq-14-253
start.Main gc --address atmsq-14-253
start.Main master --address atmsq-14-253
hadoop.mapred.JobTracker
hadoop.hdfs.server.namenode.SecondaryNameNode
hadoop.hdfs.server.namenode.NameNode
zookeeper.server.quorum.QuorumPeerMain /home/clouduser/apps/zookeeper/bin/../conf/zoo.cfg

I appreciate you looking at this.

Sandy



From: John Vines [mailto:[email protected]]
Sent: Monday, January 07, 2013 3:49 PM
To: [email protected]
Subject: Re: Thrift error while configuring new cloud

First check the monitor and see if any errors are popping up from the tserver. 
If not, check the tserver log files, as well as out/err files, to make sure 
it's starting.

On Mon, Jan 7, 2013 at 3:40 PM, Hider, Sandy <[email protected]> wrote:
All,
I am setting up a new cloud and I am getting the following error when running 
Accumulo.  From the Tracer log:

07 11:20:37,925 [impl.ServerClient] DEBUG: ClientService request failed null, retrying ...
org.apache.thrift.transport.TTransportException: Failed to connect to a server
        at org.apache.accumulo.core.client.impl.ThriftTransportPool.getAnyTransport(ThriftTransportPool.java:437)
        at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:145)
        at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:123)
        at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:105)
        at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
        at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:75)
        at org.apache.accumulo.server.client.HdfsZooInstance.getConnector(HdfsZooInstance.java:145)
        at org.apache.accumulo.server.trace.TraceServer.<init>(TraceServer.java:152)
        at org.apache.accumulo.server.trace.TraceServer.main(TraceServer.java:222)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.accumulo.start.Main$1.run(Main.java:89)
        at java.lang.Thread.run(Thread.java:722)

When I tried to go into the shell, it said it can't find any tablet servers 
and to make sure ZooKeeper and Accumulo are both running.  They both are.  My 
theory is this: the name node is dual NIC'd, and I am wondering if I somehow 
have ZooKeeper listening on the wrong NIC.  I don't see a way to configure 
which NIC it is on.  Anyone have any ideas?

This is Accumulo 1.4.0, Hadoop 0.20.2, ZooKeeper 3.4.3 (a set I have used 
successfully on other servers).

Thanks in advance

Sandy
