Hi,

  We have been using Accumulo 1.5.0 for a little while now and bugging Eric.
  Moving the discussion to the public mailing list will hopefully benefit others.

  Some background:

  1. We are running 100 tablet servers with 3 zookeepers
  2. About 7 tables with 203M entries (so far)
  3. We are running these on Amazon EC2

  We noticed that one of the tablet servers had become unresponsive. Here are
  the exceptions in the log on that tablet server:

  A. 2014-03-06 11:02:38,701 [file.BloomFilterLayer] ERROR: Thread "bloom-loader 3194" died File /accumulo/tables/8/t-00001tv/F0003cul.rf is closed
java.lang.IllegalStateException: File /accumulo/tables/8/t-00001tv/F0003cul.rf is closed
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:251)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$000(CachableBlockFile.java:143)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:212)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:367)
        at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:143)
        at org.apache.accumulo.core.file.rfile.RFile$Reader.getMetaStore(RFile.java:964)
        at org.apache.accumulo.core.file.BloomFilterLayer$BloomFilterLoader$1.run(BloomFilterLayer.java:198)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:42)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:744)

  B. 2014-03-06 09:18:08,165 [util.TServerUtils$THsHaServer] WARN : Got an IOException in internalRead!
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
        at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:515)
        at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:305)
        at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:202)
        at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.select(TNonblockingServer.java:198)
        at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.run(TNonblockingServer.java:154)
2014-03-06 09:18:08,165 [util.TServerUtils$THsHaServer] WARN : Got an IOException in internalRead!

  (B) has occurred multiple times over the last few days.

  Any insight into whether we should be concerned about these, and what the
  fix should be, would be much appreciated.

  regards

Amit
