Another thing to consider is to ensure you have a spill location set up, and
then disable hashagg/hashjoin for the query...
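
As a rough sketch of what that could look like in a session: the option names
below are the standard Drill planner options as far as I know, but the values
are placeholders I made up, not recommendations, so please check them against
the docs for your Drill version.

```sql
-- Disable the hash-based operators for this session so the planner picks
-- sort-based plans instead (the external sort is the operator that can spill).
ALTER SESSION SET `planner.enable_hashagg` = false;
ALTER SESSION SET `planner.enable_hashjoin` = false;

-- The parallelism options suggested earlier in this thread.
-- Values are illustrative assumptions only.
ALTER SESSION SET `planner.width.max_per_node` = 4;   -- decrease this
ALTER SESSION SET `planner.slice_target` = 1000000;   -- increase this
```

The spill location itself is not a session option. I believe it is set in
drill-override.conf (under drill.exec.sort.external.spill) and needs a
drillbit restart. Use ALTER SYSTEM instead of ALTER SESSION if you want the
settings to apply to all queries.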

On Wed, Mar 1, 2017 at 1:25 PM, Abhishek Girish <[email protected]> wrote:

> Hey Anup,
>
> This is indeed an issue, and I can understand that having an unstable
> environment is not something anyone wants. DRILL-4708 is still unresolved -
> hopefully someone will get to it soon. I've bumped up the priority.
>
> Unfortunately we do not publish any sizing guidelines, so you'd have to
> experiment to settle on the right load for your cluster. Please decrease
> the concurrency (number of queries running in parallel), and try bumping up
> Drill DIRECT memory. Also, please set the system options recommended by
> Sudheesh. While this may not solve the issue, it may help reduce its
> occurrence.
>
> Can you also update the JIRA with your configurations, type of queries and
> the relevant logs?
>
> -Abhishek
>
> On Wed, Mar 1, 2017 at 10:17 AM, Anup Tiwari <[email protected]>
> wrote:
>
> > Hi,
> >
> > Can someone look into this? We are now getting this error more frequently
> > in ad hoc queries as well.
> > For automation jobs, we are moving to Hive because we hit this error too
> > often in Drill.
> >
> > Regards,
> > *Anup Tiwari*
> >
> > On Sat, Dec 31, 2016 at 12:11 PM, Anup Tiwari <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > We are getting this issue a bit more frequently. Can someone please look
> > > into it and tell us why it is happening? As mentioned in my earlier
> > > mail, no other query is running when this query gets executed.
> > >
> > > Thanks in advance.
> > >
> > > Regards,
> > > *Anup Tiwari*
> > >
> > > On Sat, Dec 24, 2016 at 10:20 AM, Anup Tiwari <[email protected]> wrote:
> > >
> > >> Hi Sudheesh,
> > >>
> > > >> Please find the answers below:
> > > >>
> > > >> 1. 4 nodes in total (3 datanodes, 1 namenode).
> > > >> 2. Only one query, as this query is part of the daily dump and runs in
> > > >> the early morning.
> > > >>
> > > >> And as @chun mentioned, it seems similar to DRILL-4708, so is there any
> > > >> update on the progress of this ticket?
> > >>
> > >>
> > > >> On 22-Dec-2016 12:13 AM, "Sudheesh Katkam" <[email protected]> wrote:
> > >>
> > >> Two more questions..
> > >>
> > >> (1) How many nodes in your cluster?
> > >> (2) How many queries are running when the failure is seen?
> > >>
> > > >> If you have multiple large queries running at the same time, the load on
> > > >> the system could cause those failures (which are heartbeat related).
> > > >>
> > > >> The two options I suggested decrease the parallelism of stages in a
> > > >> query; this implies less load but slower execution.
> > > >>
> > > >> System-level options affect all queries, and session-level options affect
> > > >> queries on a specific connection. Not sure what is preferred in your
> > > >> environment.
> > >>
> > >> Also, you may be interested in metrics. More info here:
> > >>
> > > >> http://drill.apache.org/docs/monitoring-metrics/
> > >>
> > >> Thank you,
> > >> Sudheesh
> > >>
> > > >> > On Dec 21, 2016, at 4:31 AM, Anup Tiwari <[email protected]> wrote:
> > >> >
> > > >> > @sudheesh, yes, the drillbit is running on datanodeN/10.*.*.5:31010.
> > > >> >
> > > >> > Can you tell me how this will impact the query, and do I have to set
> > > >> > this at the session level or the system level?
> > >> >
> > >> >
> > >> >
> > >> > Regards,
> > >> > *Anup Tiwari*
> > >> >
> > > >> > On Tue, Dec 20, 2016 at 11:59 PM, Chun Chang <[email protected]> wrote:
> > >> >
> > >> >> I am pretty sure this is the same as DRILL-4708.
> > >> >>
> > > >> >> On Tue, Dec 20, 2016 at 10:27 AM, Sudheesh Katkam <[email protected]> wrote:
> > >> >>
> > > >> >>> Is the drillbit service (running on datanodeN/10.*.*.5:31010) actually
> > > >> >>> down when the error is seen?
> > > >> >>>
> > > >> >>> If not, try lowering parallelism using these two session options before
> > > >> >>> running the queries:
> > >> >>>
> > >> >>> planner.width.max_per_node (decrease this)
> > >> >>> planner.slice_target (increase this)
> > >> >>>
> > >> >>> Thank you,
> > >> >>> Sudheesh
> > >> >>>
> > > >> >>>> On Dec 20, 2016, at 12:28 AM, Anup Tiwari <[email protected]> wrote:
> > >> >>>>
> > >> >>>> Hi Team,
> > >> >>>>
> > > >> >>>> We run Drill automation scripts on a daily basis, and we often see
> > > >> >>>> some queries fail with the error below. I also came across
> > > >> >>>> DRILL-4708 <https://issues.apache.org/jira/browse/DRILL-4708>,
> > > >> >>>> which seems similar. Can anyone give me an update on that, or a
> > > >> >>>> workaround to avoid this issue?
> > >> >>>>
> > >> >>>> *Stack Trace :-*
> > >> >>>>
> > > >> >>>> Error: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
> > > >> >>>>
> > > >> >>>>
> > > >> >>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ] (state=,code=0)
> > > >> >>>> java.sql.SQLException: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
> > > >> >>>>
> > > >> >>>>
> > > >> >>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
> > > >> >>>>       at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:232)
> > > >> >>>>       at org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:275)
> > > >> >>>>       at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1943)
> > > >> >>>>       at org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:76)
> > > >> >>>>       at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
> > > >> >>>>       at org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:465)
> > > >> >>>>       at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
> > > >> >>>>       at org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:169)
> > > >> >>>>       at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:109)
> > > >> >>>>       at org.apache.calcite.avatica.AvaticaStatement.execute(AvaticaStatement.java:121)
> > > >> >>>>       at org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:101)
> > > >> >>>>       at sqlline.Commands.execute(Commands.java:841)
> > > >> >>>>       at sqlline.Commands.sql(Commands.java:751)
> > > >> >>>>       at sqlline.SqlLine.dispatch(SqlLine.java:746)
> > > >> >>>>       at sqlline.SqlLine.runCommands(SqlLine.java:1651)
> > > >> >>>>       at sqlline.Commands.run(Commands.java:1304)
> > > >> >>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > >> >>>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > > >> >>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >> >>>>       at java.lang.reflect.Method.invoke(Method.java:498)
> > > >> >>>>       at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
> > > >> >>>>       at sqlline.SqlLine.dispatch(SqlLine.java:742)
> > > >> >>>>       at sqlline.SqlLine.initArgs(SqlLine.java:553)
> > > >> >>>>       at sqlline.SqlLine.begin(SqlLine.java:596)
> > > >> >>>>       at sqlline.SqlLine.start(SqlLine.java:375)
> > > >> >>>>       at sqlline.SqlLine.main(SqlLine.java:268)
> > > >> >>>> Caused by: org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /10.*.*.1:41613 <--> datanodeN/10.*.*.5:31010 (user client) closed unexpectedly. Drillbit down?
> > > >> >>>>
> > > >> >>>>
> > > >> >>>> [Error Id: 5089f2f1-0dfd-40f8-9fa0-8276c08be53f ]
> > > >> >>>>       at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
> > > >> >>>>       at org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
> > > >> >>>>       at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
> > > >> >>>>       at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
> > > >> >>>>       at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
> > > >> >>>>       at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
> > > >> >>>>       at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
> > > >> >>>>       at io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
> > > >> >>>>       at io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
> > > >> >>>>       at io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
> > > >> >>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
> > > >> >>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
> > > >> >>>>       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
> > > >> >>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> > > >> >>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> > > >> >>>>       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> > > >> >>>>       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> > > >> >>>>       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> > > >> >>>>       at java.lang.Thread.run(Thread.java:745)
> > >> >>>>
> > >> >>>>
> > >> >>>> Regards,
> > >> >>>> *Anup Tiwari*
> > >> >>>
> > >> >>>
> > >> >>
> > >>
> > >>
> > >>
> > >
> >
>
