Heh, 20G might be a bit excessive because the maximum size for a
WriteAheadLog file is still probably 1G.
For a single node, a few GB is probably good for
tserver.memory.maps.max. You can also try upping the JVM heap (in
accumulo-env.sh via ACCUMULO_TSERVER_OPTS) to make more heap available
to the tserver process as a whole, which is what tserver.cache.data.size
and tserver.cache.index.size ultimately pull from.
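For example, something along these lines (a sketch; the exact values
are illustrative, not tuned for your hardware):

    # accumulo-env.sh: raise the tserver JVM heap
    export ACCUMULO_TSERVER_OPTS="-Xmx4g -Xms4g"

    <!-- accumulo-site.xml -->
    <property>
      <name>tserver.memory.maps.max</name>
      <value>2G</value>
    </property>
    <property>
      <name>tserver.cache.data.size</name>
      <value>512M</value>
    </property>
    <property>
      <name>tserver.cache.index.size</name>
      <value>512M</value>
    </property>

Note that if you have native maps enabled
(tserver.memory.maps.native.enabled=true, the default), the in-memory
map is allocated off-heap, so maps.max doesn't come out of -Xmx; with
the Java map it does, and the heap needs to be sized to hold it.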
On 6/18/14, 9:51 AM, Jianshi Huang wrote:
Oh, this memory size:

tserver.memory.maps.max: 1G -> 20G (looks like this is overkill, isn't it?)
tserver.cache.data.size: 128M? -> 1024M
tserver.cache.index.size: 128M? -> 1024M
On Thu, Jun 19, 2014 at 12:48 AM, Josh Elser <[email protected]> wrote:
Which memory size? :)
JVM max heap size? In-memory map size? Either (really, both) would
be good to increase.
If the situation is as Eric posited, it's very likely that
increasing the resources that Accumulo has available would make it
more responsive and able to keep up with your ingest load.
- Josh
On 6/18/14, 9:44 AM, Jianshi Huang wrote:
Is memory size important for reducing this sort of error? I used the
default settings (1GB per tserver), which might be too small.

I increased it to 20GB, and I saw no errors after that.
Jianshi
On Thu, Jun 19, 2014 at 12:39 AM, Eric Newton <[email protected]> wrote:
This error is often a result of overwhelming your server resources.

It basically says "an update came in that was so old, the id used to
identify the sender has already aged off."
What is your expected ingest rate during the job? What sort of
resources does Accumulo have?
On Wed, Jun 18, 2014 at 7:09 AM, Jianshi Huang
<[email protected] <mailto:[email protected]>
<mailto:jianshi.huang@gmail.__com
<mailto:[email protected]>>> wrote:
Here's the error message I got from the tserver_xxx.log:
2014-06-18 01:06:06,816 [tserver.TabletServer] INFO : Adding 1 logs for extent g;cust:2072821;cust:20700111 as alias 37
2014-06-18 01:06:16,286 [thrift.ProcessFunction] ERROR: Internal error processing applyUpdates
java.lang.RuntimeException: No Such SessionID
        at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1522)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.accumulo.trace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:63)
        at com.sun.proxy.$Proxy23.applyUpdates(Unknown Source)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2347)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2333)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:171)
        at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:478)
        at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:231)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:724)
2014-06-18 01:06:16,287 [thrift.ProcessFunction] ERROR: Internal error processing applyUpdates
java.lang.RuntimeException: No Such SessionID
        (identical stack trace to the one above)
2014-06-18 01:06:16,287 [util.TServerUtils$THsHaServer] WARN : Got an IOException during write!
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.thrift.transport.TNonblockingSocket.write(TNonblockingSocket.java:164)
        at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.write(AbstractNonblockingServer.java:381)
        at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleWrite(AbstractNonblockingServer.java:220)
        at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.select(TNonblockingServer.java:201)
Jianshi
On Wed, Jun 18, 2014 at 2:54 PM, Jianshi Huang <[email protected]> wrote:
I see. I'll check the tablet server log and paste the error message in
a later thread.

BTW, it looks like AccumuloOutputFormat is the cause; I'm currently
using BatchWriter and it works well.
My code looks like this (it's in Scala as I'm using
Spark):
AccumuloOutputFormat.setZooKeeperInstance(job,
  Conf.getString("accumulo.instance"),
  Conf.getString("accumulo.zookeeper.servers"))
AccumuloOutputFormat.setConnectorInfo(job,
  Conf.getString("accumulo.user"),
  new PasswordToken(Conf.getString("accumulo.password")))
AccumuloOutputFormat.setDefaultTableName(job,
  Conf.getString("accumulo.table"))

val paymentRDD: RDD[(Text, Mutation)] = payment.flatMap { payment =>
  // val key = new Text(Conf.getString("accumulo.table"))
  paymentMutations(payment).map((null, _))
}

paymentRDD.saveAsNewAPIHadoopFile(Conf.getString("accumulo.instance"),
  classOf[Void], classOf[Mutation],
  classOf[AccumuloOutputFormat], job.getConfiguration)
It's also possible that saveAsNewAPIHadoopFile doesn't work well with
AccumuloOutputFormat.
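For comparison, my BatchWriter path looks roughly like this (just a
sketch, assuming the same Conf keys and paymentMutations helper as
above; the Connector isn't serializable, so each partition opens its
own connection):

    import org.apache.accumulo.core.client.{BatchWriterConfig, ZooKeeperInstance}
    import org.apache.accumulo.core.client.security.tokens.PasswordToken

    // Pull the config values into plain (serializable) strings first.
    val instanceName = Conf.getString("accumulo.instance")
    val zookeepers   = Conf.getString("accumulo.zookeeper.servers")
    val user         = Conf.getString("accumulo.user")
    val password     = Conf.getString("accumulo.password")
    val table        = Conf.getString("accumulo.table")

    payment.foreachPartition { payments =>
      // One connection and one BatchWriter per partition.
      val connector = new ZooKeeperInstance(instanceName, zookeepers)
        .getConnector(user, new PasswordToken(password))
      val writer = connector.createBatchWriter(table, new BatchWriterConfig())
      try {
        payments.flatMap(paymentMutations).foreach(writer.addMutation)
      } finally {
        writer.close() // flushes any buffered mutations
      }
    }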
Jianshi
On Wed, Jun 18, 2014 at 12:45 PM, Josh Elser <[email protected]> wrote:
Check the TabletServer logs. This Exception is telling you that there
was an error on the server. You should look there for what the real
problem was. You can do this one of two ways.
1) Use the "Recent Logs" page on the Accumulo monitor
(http://accumulo_monitor_host:50095). Unless you cleared the logs or
restarted the monitor process since you got this error, you should be
able to see a nice HTML view of any errors.

2) Check the debug log, e.g.
$ACCUMULO_HOME/logs/tserver_$host.debug.log. If you're running
tservers on more than one node, be sure that you check the log files
on all nodes.
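If you just want to find the errors quickly, something like this on
each node should surface them (the -A 20 pulls in the stack trace
that follows each error line):

    grep -A 20 ERROR $ACCUMULO_HOME/logs/tserver_*.debug.log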
- Josh
On 6/17/14, 9:33 PM, Jianshi Huang wrote:
Hi,

I got the following errors during MapReduce ingestion. Are they
serious errors?
java.io.IOException: org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0 security codes: {} # server errors 1 # exceptions 0
        at org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat$AccumuloRecordWriter.write(AccumuloOutputFormat.java:437)
        at org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat$AccumuloRecordWriter.write(AccumuloOutputFormat.java:373)
        at org.apache.spark.rdd.PairRDDFunctions.org$apache$spark$rdd$PairRDDFunctions$$writeShard$1(PairRDDFunctions.scala:716)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:730)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:730)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
        at org.apache.spark.scheduler.Task.run(Task.scala:51)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
And
java.io.IOException: org.apache.accumulo.core.client.AccumuloException: org.apache.thrift.TApplicationException: Internal error processing applyUpdates
        at org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat.getRecordWriter(AccumuloOutputFormat.java:558)
        at org.apache.spark.rdd.PairRDDFunctions.org$apache$spark$rdd$PairRDDFunctions$$writeShard$1(PairRDDFunctions.scala:712)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:730)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:730)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
        at org.apache.spark.scheduler.Task.run(Task.scala:51)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Cheers,
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/