I think the default stack size is 1M, or at least it used to be. Either way, 128M is almost certainly too large. I think your best option is to just remove the -Xss option.
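In case it helps, a minimal sketch of what that tserver line in accumulo/conf/accumulo-env.sh would look like with -Xss removed (keeping the heap settings from the line quoted below; tune -Xmx/-Xms for your own hardware):

    # same as the quoted line, minus -Xss; threads get the JVM's default stack size
    test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx500m -Xms128m"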
On Wed, Jan 16, 2013 at 4:43 PM, Hider, Sandy <[email protected]> wrote:

> Thanks Eric and John for the replies. In case anyone else runs into this
> problem:
>
> On the slave nodes, in the accumulo log dir, the file
> tserver_<computer_name>.out stated:
>
>     The stack size specified is too small, Specify at least 160k
>
> I had accidentally put a k instead of an m on the tserver line in
> accumulo/conf/accumulo-env.sh:
>
>     test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx500m -Xms128m -Xss128k"
>
> instead of
>
>     test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx500m -Xms128m -Xss128m"
>
> Thanks again,
>
> Sandy
>
>
> From: Eric Newton [mailto:[email protected]]
> Sent: Monday, January 07, 2013 4:57 PM
> To: [email protected]; vines
> Subject: Re: Thrift error while configuring new cloud
>
> Make sure the $ACCUMULO_LOG_DIR exists on all the nodes, too.
>
> -Eric
>
>
> On Mon, Jan 7, 2013 at 4:48 PM, John Vines <[email protected]> wrote:
>
> If you actually visit the monitor page, on port 50095, it does log
> aggregation so you don't have to dig up logs.
>
> It looks like your tserver and logger are not running. This is why things
> are not working for you. There may be log info on the monitor page, but you
> may have to go to your logs directory and find the logs for them. There may
> only be a .out and .err file for them, which should describe the error.
>
>
> On Mon, Jan 7, 2013 at 4:34 PM, Hider, Sandy <[email protected]> wrote:
>
> In monitor_<servername>.log:
>
> 04 16:56:46,085 [server.Accumulo] INFO : tserver.bloom.load.concurrent.max = 4
> 04 16:56:46,085 [server.Accumulo] INFO : tserver.bulk.assign.threads = 1
> 04 16:56:46,085 [server.Accumulo] INFO : tserver.bulk.process.threads = 1
> 04 16:56:46,085 [server.Accumulo] INFO : tserver.bulk.retry.max = 3
> 04 16:56:46,085 [server.Accumulo] INFO : tserver.cache.data.size = 50M
> 04 16:56:46,086 [server.Accumulo] INFO : tserver.cache.index.size = 512M
> 04 16:56:46,086 [server.Accumulo] INFO : tserver.client.timeout = 3s
> 04 16:56:46,086 [server.Accumulo] INFO : tserver.compaction.major.concurrent.max = 3
> 04 16:56:46,088 [server.Accumulo] INFO : tserver.compaction.major.delay = 30s
> 04 16:56:46,088 [server.Accumulo] INFO : tserver.compaction.major.thread.files.open.max = 10
> 04 16:56:46,089 [server.Accumulo] INFO : tserver.compaction.minor.concurrent.max = 4
> 04 16:56:46,089 [server.Accumulo] INFO : tserver.default.blocksize = 1M
> 04 16:56:46,089 [server.Accumulo] INFO : tserver.dir.memdump = /tmp
> 04 16:56:46,089 [server.Accumulo] INFO : tserver.files.open.idle = 1m
> 04 16:56:46,089 [server.Accumulo] INFO : tserver.hold.time.max = 5m
> 04 16:56:46,089 [server.Accumulo] INFO : tserver.logger.count = 2
> 04 16:56:46,089 [server.Accumulo] INFO : tserver.logger.strategy = org.apache.accumulo.server.tabletserver.log.RoundRobinLoggerStrategy
> 04 16:56:46,090 [server.Accumulo] INFO : tserver.logger.timeout = 30s
> 04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.lock = false
> 04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.manager = org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager
> 04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.maps.max = 1G
> 04 16:56:46,090 [server.Accumulo] INFO : tserver.memory.maps.native.enabled = true
> 04 16:56:46,090 [server.Accumulo] INFO : tserver.metadata.readahead.concurrent.max = 8
> 04 16:56:46,090 [server.Accumulo] INFO : tserver.migrations.concurrent.max = 1
> 04 16:56:46,091 [server.Accumulo] INFO : tserver.monitor.fs = true
> 04 16:56:46,091 [server.Accumulo] INFO : tserver.mutation.queue.max = 256K
> 04 16:56:46,091 [server.Accumulo] INFO : tserver.port.client = 9997
> 04 16:56:46,091 [server.Accumulo] INFO : tserver.port.search = false
> 04 16:56:46,091 [server.Accumulo] INFO : tserver.readahead.concurrent.max = 16
> 04 16:56:46,091 [server.Accumulo] INFO : tserver.scan.files.open.max = 100
> 04 16:56:46,091 [server.Accumulo] INFO : tserver.server.threadcheck.time = 1s
> 04 16:56:46,092 [server.Accumulo] INFO : tserver.server.threads.minimum = 2
> 04 16:56:46,092 [server.Accumulo] INFO : tserver.session.idle.max = 1m
> 04 16:56:46,092 [server.Accumulo] INFO : tserver.tablet.split.midpoint.files.max = 30
> 04 16:56:46,092 [server.Accumulo] INFO : tserver.walog.max.size = 100M
> 04 16:56:47,059 [impl.ServerClient] WARN : tracer:atmsq-14-253 There are no tablet servers: check that zookeeper and accumulo are running.
> 04 16:56:49,318 [monitor.Monitor] INFO : Failed to obtain problem reports
> java.lang.RuntimeException: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
>         at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:186)
>         at org.apache.accumulo.server.problems.ProblemReports$3.hasNext(ProblemReports.java:241)
>         at org.apache.accumulo.server.problems.ProblemReports.summarize(ProblemReports.java:299)
>         at org.apache.accumulo.server.monitor.Monitor.fetchData(Monitor.java:367)
>         at org.apache.accumulo.server.monitor.Monitor$2.run(Monitor.java:468)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.accumulo.core.client.impl.ThriftScanner$ScanTimedOutException
>         at org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:241)
>         at org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:94)
>         at org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:176)
>         ... 6 more
>
> It then repeats the exception over and over. I don’t see a tserver log
> file. Is that inside another log?
>
> Java running processes:
>
> start.Main tracer --address atmsq-14-253
> start.Main monitor --address atmsq-14-253
> start.Main gc --address atmsq-14-253
> start.Main master --address atmsq-14-253
> hadoop.mapred.JobTracker
> hadoop.hdfs.server.namenode.SecondaryNameNode
> hadoop.hdfs.server.namenode.NameNode
> zookeeper.server.quorum.QuorumPeerMain /home/clouduser/apps/zookeeper/bin/../conf/zoo.cfg
>
> I appreciate you looking at this.
>
> Sandy
>
>
> From: John Vines [mailto:[email protected]]
> Sent: Monday, January 07, 2013 3:49 PM
> To: [email protected]
> Subject: Re: Thrift error while configuring new cloud
>
> First check the monitor and see if any errors are popping up from the
> tserver. If not, check the tserver log files, as well as out/err files, to
> make sure it's starting.
>
>
> On Mon, Jan 7, 2013 at 3:40 PM, Hider, Sandy <[email protected]> wrote:
>
> All,
>
> I am setting up a new cloud and I am getting the following error when
> running Accumulo. From the Tracer log:
>
> 07 11:20:37,925 [impl.ServerClient] DEBUG: ClientService request failed null, retrying ...
> org.apache.thrift.transport.TTransportException: Failed to connect to a server
>         at org.apache.accumulo.core.client.impl.ThriftTransportPool.getAnyTransport(ThriftTransportPool.java:437)
>         at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:145)
>         at org.apache.accumulo.core.client.impl.ServerClient.getConnection(ServerClient.java:123)
>         at org.apache.accumulo.core.client.impl.ServerClient.executeRaw(ServerClient.java:105)
>         at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:71)
>         at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:75)
>         at org.apache.accumulo.server.client.HdfsZooInstance.getConnector(HdfsZooInstance.java:145)
>         at org.apache.accumulo.server.trace.TraceServer.<init>(TraceServer.java:152)
>         at org.apache.accumulo.server.trace.TraceServer.main(TraceServer.java:222)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.accumulo.start.Main$1.run(Main.java:89)
>         at java.lang.Thread.run(Thread.java:722)
>
> When I tried to go into the shell, it says it can’t find any tablet servers
> and to make sure zookeeper and accumulo are both running. They both are. My
> theory is this: the name node is dual NIC’d. I am wondering if somehow I
> have the zookeeper software listening on the wrong NIC card? I don’t see a
> way to configure which NIC it is on. Anyone have any ideas?
>
> This is Accumulo-1.4.0, Hadoop 0.20.2, Zookeeper 3.4.3 (a set I have used
> successfully on other servers).
>
> Thanks in advance,
>
> Sandy
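
P.S. On the dual-NIC theory from the first message in this thread: assuming a Linux host and the default client port, you can check which address ZooKeeper is actually bound to with something like

    # 2181 is ZooKeeper's default client port
    netstat -tlnp | grep 2181

If it is listening on the wrong interface, ZooKeeper (3.3 and later) lets you pin the client port to a single address in zoo.cfg, for example:

    clientPort=2181
    # placeholder address; use whichever NIC the rest of the cluster talks to
    clientPortAddress=192.168.1.10

By default ZooKeeper binds the client port on all interfaces, so this usually only matters if routing or a firewall gets in the way.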
