Can you share a specific query that consistently times out? What kind of data are you querying?
Are you running Drill in embedded mode, or do you have a Drill cluster? In the case of a cluster, what is its size and how many cores does it have? What version of Drill are you running?

Thanks

On Tue, Apr 5, 2016 at 7:59 AM, COUERON Damien (i-BP - MICROPOLE) <[email protected]> wrote:

> Besides the log below, what kind of details are you interested in?
>
> -----Original Message-----
> From: Abdel Hakim Deneche [mailto:[email protected]]
> Sent: Sunday, April 3, 2016 07:37
> To: user
> Subject: Re: How to modify connection timeout delay ?
>
> Hi Damien,
>
> Like Jason said, we have a heartbeat mechanism that should have prevented
> this issue altogether, so I'm interested to learn how this is happening.
> We've seen this happen many times, but so far we have never been able to
> reproduce it.
>
> Could you give us more details so we can reproduce the issue?
>
> Thanks
>
> On Thu, Mar 31, 2016 at 2:47 PM, COUERON Damien (i-BP - MICROPOLE) <[email protected]> wrote:
>
> > Hi Jason,
> >
> > Thanks for your help. I have set this parameter to 0 on every drillbit
> > and it works like a charm now.
> >
> > Regarding your questions, there was no particular query that triggered
> > this issue. Every query longer than 30 seconds was impacted.
> > Please find below the log messages I received:
> >
> > 2016-03-24 14:18:31,368 [290c16d8-4244-d664-4562-5b156f3e6fff:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query id 290c16d8-4244-d664-4562-5b156f3e6fff: select count(columns[1]) from hdfs.lemo.mails
> > 2016-03-24 14:18:31,417 [290c16d8-4244-d664-4562-5b156f3e6fff:foreman] INFO o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 1 threads. Time: 2ms total, 2.056748ms avg, 2ms max.
> > 2016-03-24 14:18:31,417 [290c16d8-4244-d664-4562-5b156f3e6fff:foreman] INFO o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 1 threads. Earliest start: 1.391000 μs, Latest start: 1.391000 μs, Average start: 1.391000 μs .
> > 2016-03-24 14:18:31,466 [290c16d8-4244-d664-4562-5b156f3e6fff:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 290c16d8-4244-d664-4562-5b156f3e6fff:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
> > 2016-03-24 14:18:31,466 [290c16d8-4244-d664-4562-5b156f3e6fff:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 290c16d8-4244-d664-4562-5b156f3e6fff:0:0: State to report: RUNNING
> > 2016-03-24 14:19:01,570 [UserServer-1] INFO o.a.drill.exec.rpc.user.UserServer - RPC connection /39.6.64.20:31010 <--> /39.6.64.22:53976 (user client) timed out. Timeout was set to 30 seconds. Closing connection.
> > 2016-03-24 14:19:01,579 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.fragment.FragmentExecutor - 290c16d8-4244-d664-4562-5b156f3e6fff:0:0: State change requested RUNNING --> CANCELLATION_REQUESTED
> > 2016-03-24 14:19:01,580 [CONTROL-rpc-event-queue] INFO o.a.d.e.w.f.FragmentStatusReporter - 290c16d8-4244-d664-4562-5b156f3e6fff:0:0: State to report: CANCELLATION_REQUESTED
> > 2016-03-24 14:19:01,591 [290c16d8-4244-d664-4562-5b156f3e6fff:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 290c16d8-4244-d664-4562-5b156f3e6fff:0:0: State change requested CANCELLATION_REQUESTED --> FINISHED
> > 2016-03-24 14:19:01,591 [290c16d8-4244-d664-4562-5b156f3e6fff:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 290c16d8-4244-d664-4562-5b156f3e6fff:0:0: State to report: CANCELLED
> > 2016-03-24 14:19:01,626 [UserServer-1] INFO o.a.drill.exec.work.foreman.Foreman - Failure while trying communicate query result to initiating client. This would happen if a client is disconnected before response notice can be sent.
> > org.apache.drill.exec.rpc.ChannelClosedException: null
> >     at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:89) [drill-rpc-1.6.0.jar:1.6.0]
> >     at org.apache.drill.exec.rpc.CoordinationQueue$RpcListener.operationComplete(CoordinationQueue.java:67) [drill-rpc-1.6.0.jar:1.6.0]
> >     at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:788) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:689) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1114) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:705) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:980) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1032) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:965) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> >     at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
> >     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> > 2016-03-24 14:19:01,626 [UserServer-1] WARN o.a.drill.exec.work.foreman.Foreman - Dropping request to move to FAILED state as query is already at CANCELED state (which is terminal).
> >
> > -----Original Message-----
> > From: Jason Altekruse [mailto:[email protected]]
> > Sent: Thursday, March 24, 2016 21:27
> > To: [email protected]
> > Subject: Re: How to modify connection timeout delay ?
> >
> > This can be set in drill-override.conf, but it should not be an issue
> > even for long-running queries, because we should be sending a heartbeat
> > signal back throughout a query's execution, even if the query has not
> > yet produced any real data. Can you share the query you are running and
> > any errors you can find in the logs?
> >
> > To adjust the timeout, you can set a higher value for
> > drill.exec.user.timeout in conf/drill-override.conf; the value is
> > specified in seconds.
> >
> > Jason Altekruse
> > Software Engineer at Dremio
> > Apache Drill Committer
> >
> > On Thu, Mar 24, 2016 at 9:20 AM, COUERON Damien (i-BP - MICROPOLE) <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I'm trying to use Drill with the sqlline command line on Linux, but
> > > I'm not able to keep a connection to my drillbits alive long
> > > enough to get the results back.
> > > The connection is reset after 30 seconds, even while a query is running!
> > >
> > > I did find the "timeout" variable and set it to -1, but it does not
> > > change a thing.
> > >
> > > The drillbit.log file contains the following lines:
> > > 2016-03-24 17:15:37,651 [UserServer-1] INFO o.a.drill.exec.rpc.user.UserServer - RPC connection /39.6.64.15:31010 <--> /39.6.64.22:46641 (user client) timed out. Timeout was set to 30 seconds. Closing connection.
> > >
> > > Does anyone know where I can modify this setting?
> > >
> > > Best regards,
> > > Damien
> > >
> > > -----Original Message-----
> > > From: Andries Engelbrecht [mailto:[email protected]]
> > > Sent: Thursday, March 24, 2016 16:47
> > > To: [email protected]
> > > Subject: Re: JDBC storage plugin fails
> > >
> > > I have also seen this fail if the supplied credentials are not valid.
> > >
> > > With MySQL it can be a bit tricky, as credentials can depend on the
> > > location of the client, so make sure the MySQL user credentials are
> > > for % since Drill is a distributed system and any of the nodes may
> > > connect to MySQL.
> > >
> > > --Andries
> > >
> > > > On Mar 24, 2016, at 8:39 AM, Christopher Matta <[email protected]> wrote:
> > > >
> > > > Scott,
> > > > Could you paste the JSON you're using to define the MySQL storage
> > > > plugin here? In my experience, you get that error when either
> > > > the JSON is invalid or Drill can't find the class you defined.
> > > >
> > > > I've created a JIRA to improve this functionality:
> > > > https://issues.apache.org/jira/browse/DRILL-4533
> > > >
> > > > On Thursday, March 24, 2016, Wilburn, Scott <[email protected]> wrote:
> > > >
> > > >> Hello,
> > > >> I am trying to connect to a MySQL database using the JDBC storage
> > > >> plugin with Drill 1.5, as described here:
> > > >> https://drill.apache.org/docs/rdbms-storage-plugin/
> > > >>
> > > >> When I try to create the plugin through the web UI, I get the
> > > >> following vague error:
> > > >> "Please retry: error (unable to create/ update storage)"
> > > >>
> > > >> I tried to see if more information exists in the drillbit logs,
> > > >> but didn't see anything relevant. Does anyone know where I can
> > > >> find a better description of the problem, or is JDBC just broken
> > > >> in 1.5? I found the following Jira with the same description for
> > > >> 1.2 (DRILL-3977). Has it been broken since 1.2, or am I doing
> > > >> something stupid?
> > > >>
> > > >> Any help would be appreciated.
> > > >>
> > > >> Thanks,
> > > >> Scott Wilburn
> > > >
> > > > --
> > > > Chris Matta
> > > > [email protected]
> > > > 215-701-3146
> > >
> > > __________
> > > The integrity of this message cannot be guaranteed on the Internet.
> > > The i-BP company cannot therefore be considered responsible for the
> > > contents. If you are not the intended recipient of this message, then
> > > please delete it and notify the sender.
> > > __________
> >
> > --
> > Abdelhakim Deneche
> > Software Engineer
> > <http://www.mapr.com/>
>
> Now Available - Free Hadoop On-Demand Training
> <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>

--
Abdelhakim Deneche
Software Engineer
<http://www.mapr.com/>
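
For reference, the override discussed in the quoted thread might look roughly like the fragment below in conf/drill-override.conf. This is only a sketch under stated assumptions: the thread names the option drill.exec.user.timeout, while the default configuration shipped with Drill appears to nest it under drill.exec.rpc.user.timeout, so the exact key path should be checked against the drill-module.conf of the version in use. The value is in seconds, and 0 is simply the value Damien reports working for him on his 1.6.0 drillbits.

```
# conf/drill-override.conf (HOCON), set on every drillbit
drill.exec: {
  # ... keep existing settings such as cluster-id and zk.connect unchanged ...
  rpc: {
    user: {
      # User-client connection timeout, in seconds. The log in this thread
      # shows it was 30 before the change.
      # ASSUMPTION: key path drill.exec.rpc.user.timeout; the thread itself
      # refers to it as drill.exec.user.timeout.
      timeout: 0
    }
  }
}
```

Settings in drill-override.conf are read at startup, so the drillbits most likely need a restart before the new value takes effect.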
