[ 
https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227094#comment-15227094
 ] 

ASF GitHub Bot commented on DRILL-3714:
---------------------------------------

Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/442#discussion_r58611408
  
    --- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RpcBus.java ---
    @@ -159,19 +159,15 @@ public ChannelClosedHandler(C clientConnection, 
Channel channel) {
         @Override
         public void operationComplete(ChannelFuture future) throws Exception {
           String msg;
    -      if(local!=null) {
    +      if(local != null) {
             msg = String.format("Channel closed %s <--> %s.", local, remote);
           }else{
             msg = String.format("Channel closed %s <--> %s.", 
future.channel().localAddress(), future.channel().remoteAddress());
           }
     
    -      if (RpcBus.this.isClient()) {
    -        if(local != null) {
    -          logger.info(String.format(msg));
    -        }
    -      } else {
    -        queue.channelClosed(new ChannelClosedException(msg));
    -      }
    +      logger.info(msg); // should we leave this at info level ?
    +
    +      queue.channelClosed(new ChannelClosedException(msg), 
future.channel());
     
           clientConnection.close();
    --- End diff --
    
    I believe this is still required because we could be in a backpressure 
blocking state and need to error out the pending messages. My patch tries to 
consolidate the connection close behavior (so we have one place to do work 
instead of two). Note that right now we add close handlers in both the 
connection and the rpc bus which is confusing/unclean.


> Query runs out of memory and remains in CANCELLATION_REQUESTED state until 
> drillbit is restarted
> ------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3714
>                 URL: https://issues.apache.org/jira/browse/DRILL-3714
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.2.0
>            Reporter: Victoria Markman
>            Assignee: Jacques Nadeau
>            Priority: Critical
>             Fix For: 1.7.0
>
>         Attachments: Screen Shot 2015-08-26 at 10.36.33 AM.png, drillbit.log, 
> jstack.txt, query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
>
>
> This is a variation of DRILL-3705 with the difference of drill behavior when 
> hitting OOM condition.
> Query runs out of memory during execution and remains in 
> "CANCELLATION_REQUESTED" state until drillbit is bounced.
> Client (sqlline in this case) never gets a response from the server.
> Reproduction details:
>         Single node drillbit installation.
>         DRILL_MAX_DIRECT_MEMORY="8G"
>         DRILL_HEAP="4G"
> Run this query on TPCDS SF100 data set
> {code}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss WHERE ss.ss_store_sk IS NOT NULL ORDER BY 1 
> LIMIT 10;
> {code}
> drillbit.log
> {code}
> 2015-08-26 16:54:58,469 [2a2210a7-7a78-c774-d54c-c863d0b77bb0:frag:3:22] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 2a2210a7-7a78-c774-d54c-c863d0b77bb0:3:22: State to report: RUNNING
> 2015-08-26 16:55:50,498 [BitServer-5] WARN  
> o.a.drill.exec.rpc.data.DataServer - Message of mode REQUEST of rpc type 3 
> took longer than 500ms.  Actual duration was 2569ms.
> 2015-08-26 16:56:31,086 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:54554 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
>         at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618)
>  [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
>         at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
>         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.7.0_71]
>         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
> ~[na:1.7.0_71]
>         at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:110) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.buffer.WrappedByteBuf.writeBytes(WrappedByteBuf.java:600) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.buffer.UnsafeDirectLittleEndian.writeBytes(UnsafeDirectLittleEndian.java:28)
>  ~[drill-java-exec-1.2.0-SNAPSHOT.jar:4.0.27.Final]
>         at 
> io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:227)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
>         ... 11 common frames omitted
> 2015-08-26 16:56:31,087 [BitServer-5] INFO  
> o.a.d.exec.rpc.ProtobufLengthDecoder - Channel is closed, discarding 
> remaining 124958 byte(s) in buffer.
> 2015-08-26 16:56:31,087 [BitClient-1] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:54554 <--> /10.10.88.133:31012 (data client).  
> Closing connection.
> java.io.IOException: syscall:read(...)() failed: Connection reset by peer
> 2015-08-26 16:56:31,088 [BitClient-1] INFO  
> o.a.drill.exec.rpc.data.DataClient - Channel closed /10.10.88.133:54554 <--> 
> /10.10.88.133:31012.
> 2015-08-26 16:56:35,325 [BitServer-6] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:54555 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
>         at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618)
>  [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> {code}
> Attached:
>         query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
>         drillbit.log
>         jstack.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to