All:

I finally got some time to look into this further. I enabled debug logging on accumulo.core.client. The only interesting new error I got is java.nio.channels.ClosedByInterruptException; see below:

2017-07-10 13:34:56,106 | DEBUG | [main] | (AccumuloPersistor.java:1556) - Feature Update Time: 594, Updated: true
2017-07-10 13:34:56,118 | DEBUG | [main] | (AccumuloPersistor.java:1441) - Linkage Update Time: 12 Updated: true
2017-07-10 13:34:56,120 | WARN | [batch scanner 181- 1 looking up 1 ranges at bdpnode6.bdpdev.incadencecorp.com:9997] | (TIOStreamTransport.java:112) - Error closing output stream.
java.io.IOException: The stream is closed
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:118)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
        at org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110)
        at org.apache.thrift.transport.TFramedTransport.close(TFramedTransport.java:89)
        at org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.close(ThriftTransportPool.java:309)
        at org.apache.accumulo.core.client.impl.ThriftTransportPool.returnTransport(ThriftTransportPool.java:571)
        at org.apache.accumulo.core.rpc.ThriftUtil.returnClient(ThriftUtil.java:151)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:686)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
        at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:748)
2017-07-10 13:34:56,121 | DEBUG | [batch scanner 181- 1 looking up 1 ranges at bdpnode6.bdpdev.incadencecorp.com:9997] | (TabletServerBatchReaderIterator.java:689) - Server : bdpnode6.bdpdev.incadencecorp.com:9997 msg : java.nio.channels.ClosedByInterruptException
2017-07-10 13:34:56,121 | DEBUG | [batch scanner 181- 1 looking up 1 ranges at bdpnode6.bdpdev.incadencecorp.com:9997] | (TabletServerBatchReaderIterator.java:366) - org.apache.thrift.transport.TTransportException: java.nio.channels.ClosedByInterruptException
java.io.IOException: org.apache.thrift.transport.TTransportException: java.nio.channels.ClosedByInterruptException
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:691)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
        at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException: java.nio.channels.ClosedByInterruptException
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
        at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:158)
        at org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.flush(ThriftTransportPool.java:320)
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.send_closeMultiScan(TabletClientService.java:365)
        at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.closeMultiScan(TabletClientService.java:356)
        at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:683)
        ... 6 more
Caused by: java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:478)
        at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:159)
        ... 12 more
2017-07-10 13:34:56,140 | DEBUG | [main] | (AccumuloPersistor.java:1441) - Linkage Update Time: 22 Updated: true
2017-07-10 13:34:56,152 | DEBUG | [main] | (AccumuloPersistor.java:1441) - Linkage Update Time: 11 Updated: true
2017-07-10 13:34:56,152 | DEBUG | [batch scanner 183- 1 looking up 1 ranges at bdpnode8.bdpdev.incadencecorp.com:9997] | (TabletServerBatchReaderIterator.java:689) - Server : bdpnode8.bdpdev.incadencecorp.com:9997 msg : java.nio.channels.ClosedByInterruptException

I tried enabling TRACE-level output, but got a divide-by-zero exception in one of the TRACE calls (that bug was fixed in 1.7.3).

As near as I can tell from the logs and my code, the failure happens after I have done all my updates.
No exceptions seem to propagate back to my code.
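
For reference, the JDK documents that a thread which starts a blocking I/O operation on an interruptible channel while its interrupt flag is set gets a ClosedByInterruptException, and the channel is closed as a side effect. The sketch below reproduces just that JDK behavior with a java.nio Pipe. It is a minimal illustration under assumptions: the class and method names are made up, nothing in it is Accumulo code, and it does not establish what is interrupting the batch scanner threads in my case.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class: JDK only, no Accumulo or Thrift involved.
class InterruptedWriteDemo {

    // Writes one byte to a blocking, interruptible channel from a thread whose
    // interrupt flag is already set, and reports what the write did.
    static String writeWhileInterrupted() {
        AtomicReference<String> outcome = new AtomicReference<>("worker did not run");
        Thread worker = new Thread(() -> {
            try {
                Pipe pipe = Pipe.open();
                try {
                    // Stand-in for an interrupt delivered by some other code.
                    Thread.currentThread().interrupt();
                    pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
                    outcome.set("write succeeded");
                } catch (ClosedByInterruptException e) {
                    outcome.set(e.getClass().getName());
                } finally {
                    Thread.interrupted(); // clear the flag before cleaning up
                    pipe.source().close();
                    pipe.sink().close();
                }
            } catch (IOException e) {
                outcome.set("unexpected: " + e);
            }
        }, "demo-writer");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(writeWhileInterrupted());
    }
}
```

If the same mechanism is at work in the trace above, the thing to hunt for would be whatever interrupts the batch scanner thread before its final flush, rather than anything closing the socket directly. That is only a guess on my part.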




On 4/19/17 2:56 PM, Keith Turner wrote:
On Wed, Apr 19, 2017 at 12:38 PM, David Boyd <db...@incadencecorp.com> wrote:
All:

    I am getting this stack trace periodically based on no pattern I can
determine from my application.

If you are seeing this regularly, try to enable Accumulo debug
logging.  That may allow you to see messages like [1] which may help
understand the cause.

[1] https://github.com/apache/accumulo/blob/rel/1.7.2/core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchReaderIterator.java#L689

Is this a message I should be worried about?

What is a technique to trace this back to my code and the cause?

Obviously something is closing things before the thread closes it.


2017-04-19 12:33:53,423 |  WARN | [batch scanner 19824- 8 looking up 1 ranges at accumulodev:9997] | (TIOStreamTransport.java:112) - Error closing output stream.
java.io.IOException: The stream is closed
     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:118)
     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
     at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
     at org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110)
     at org.apache.thrift.transport.TFramedTransport.close(TFramedTransport.java:89)
     at org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.close(ThriftTransportPool.java:309)
     at org.apache.accumulo.core.client.impl.ThriftTransportPool.returnTransport(ThriftTransportPool.java:571)
     at org.apache.accumulo.core.rpc.ThriftUtil.returnClient(ThriftUtil.java:151)
     at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:686)
     at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
     at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
     at java.lang.Thread.run(Thread.java:745)


--
========= mailto:db...@incadencecorp.com ============
David W. Boyd
VP,  Data Solutions
10432 Balls Ford, Suite 240
Manassas, VA 20109
office:   +1-703-552-2862
cell:     +1-703-402-7908
============== http://www.incadencecorp.com/ ============
ISO/IEC JTC1 WG9, editor ISO/IEC 20547 Big Data Reference Architecture
Chair ANSI/INCITS TC Big Data
Co-chair NIST Big Data Public Working Group Reference Architecture
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
Board Member- USSTEM Foundation - www.usstem.org

The information contained in this message may be privileged
and/or confidential and protected from disclosure.
If the reader of this message is not the intended recipient
or an employee or agent responsible for delivering this message
to the intended recipient, you are hereby notified that any
dissemination, distribution or copying of this communication
is strictly prohibited.  If you have received this communication
in error, please notify the sender immediately by replying to
this message and deleting the material from any computer.



