Can you check if this resolves the issue? http://drill.apache.org/docs/s3-storage-plugin/#quering-parquet-format-files-on-s3
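
For reference, as far as I recall that page suggests raising the S3A connection pool limit, since a join or `select *` over Parquet can issue more concurrent S3 requests than the pool allows. That would match the ConnectionPoolTimeoutException you're seeing. A minimal sketch, assuming your storage plugin connects over s3a:// (Hadoop 2.7.1 or later) and that core-site.xml lives in Drill's conf directory; the value 100 is only illustrative, not a tuned recommendation:

```xml
<!-- core-site.xml (assumed location: $DRILL_HOME/conf) -->
<configuration>
  <property>
    <name>fs.s3a.connection.maximum</name>
    <!-- illustrative value; size it to your query concurrency -->
    <value>100</value>
  </property>
</configuration>
```

The drillbits would presumably need a restart on every node before the new limit is picked up.
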
-Abhishek

On Saturday, April 30, 2016, Rob Canavan <[email protected]> wrote:

> I'm trying to join two parquet files that I have stored in S3 and the query
> keeps timing out:
>
> select * from aws_s3.`dim/market_header.parquet` a inner join
> aws_s3.`dim/market_program.parquet` b on a.market_no = b.market_no;
>
> I can run counts and aggs on the two tables fine:
>
> select count(*) from aws_s3.`dim/market_header.parquet`;
> +---------+
> | EXPR$0  |
> +---------+
> | 420     |
> +---------+
> 1 row selected (0.984 seconds)
>
>
> select count(*) from aws_s3.`dim/market_program.parquet`;
> +----------+
> | EXPR$0   |
> +----------+
> | 1035318  |
> +----------+
> 1 row selected (0.738 seconds)
>
> select sum(cast(series_no as float)) from
> aws_s3.`dim/market_program.parquet` as b limit 10;
> +--------------------+
> |       EXPR$0       |
> +--------------------+
> | 2.072667694581E12  |
> +--------------------+
> 1 row selected (1.63 seconds)
>
>
> When I run the query to join them, after a few minutes I get:
>
> Error: SYSTEM ERROR: ConnectionPoolTimeoutException: Timeout waiting for
> connection from pool
>
> Fragment 0:0
>
> [Error Id: 45a6055c-08af-4ecd-b670-8dbcf196673f on .......
> amazonaws.com:31010] (state=,code=0)
>
>
> This is a distributed setup with 4 drillbits, each with 16 cores and 64 GB
> of memory. My drill-env.sh has:
>
> DRILL_MAX_DIRECT_MEMORY="55G"
> DRILL_HEAP="4G"
>
>
> There's also a stacktrace in sqlline.log:
>
> [Error Id: 45a6055c-08af-4ecd-b670-8dbcf196673f on .
> compute-1.amazonaws.com:31010]
>     at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119) [drill-java-exec-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113) [drill-java-exec-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285) [drill-rpc-1.6.0.jar:1.6.0]
>     at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257) [drill-rpc-1.6.0.jar:1.6.0]
>     at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254) [netty-handler-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
>
>
> I guess I'm not even sure where to start looking to debug this issue.
> Has anyone run into this problem before?
>
>
> Thanks.
>

--
Abhishek Girish
Senior Software Engineer
(408) 476-9209
<http://www.mapr.com/>
