Hi,
Thanks Abhishek!
I would like to be notified when a Drillbit process becomes orphaned, that is,
when it gets disconnected from the other running Drillbits for some reason. In
my case it is definitely not because of an unclean shutdown, as those Drillbits
have been running for months.
I know I can check the logs and kill the orphaned process, which is what I did
in my case, but I would like to have a notification when a Drillbit goes down.
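
Something along these lines is what I have in mind. It is only a minimal sketch, not a tested solution. It assumes the Drill REST API accepts SQL at /query.json on the web port (8047 by default) without authentication, and that sys.drillbits exposes "hostname" and "state" columns. The function names and the host list are hypothetical. A Drillbit is reported if it is missing from sys.drillbits or its state is anything other than ONLINE.

```python
import json
import urllib.request


def fetch_drillbit_rows(base_url):
    """Run `select * from sys.drillbits` via the REST API and return its rows.

    Assumption: POST <base_url>/query.json answers with a JSON object that
    has a "rows" list of dicts, and no authentication is required.
    """
    body = json.dumps(
        {"queryType": "SQL", "query": "select * from sys.drillbits"}
    ).encode("utf-8")
    req = urllib.request.Request(
        base_url + "/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["rows"]


def find_unhealthy(expected_hosts, rows):
    """Return expected hosts that are unregistered or not ONLINE.

    If a row has no "state" column (assumed to be the case on older
    versions), a registered Drillbit is treated as online.
    """
    online = {
        row["hostname"]
        for row in rows
        if row.get("state", "ONLINE") == "ONLINE"
    }
    return sorted(set(expected_hosts) - online)
```

A cron job could call find_unhealthy(EXPECTED_HOSTS, fetch_drillbit_rows("http://some-healthy-node:8047")) and send an alert whenever the returned list is not empty. The query has to go to a Drillbit that is itself still part of the cluster, so a failed request should also be treated as an alert.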


Thanks,
Divya

On Fri, 13 Jul 2018 at 04:15, Abhishek Girish <[email protected]> wrote:

> Hey Divya,
>
> It would depend on the situation, afaik. The sys.drillbits table contains a
> list of all running Drillbits. If one of the Drillbits has issues and
> cannot stay connected to the cluster, I would assume it would be
> unregistered and may not show up in the output of sys.drillbits. If it's an
> intermittent issue and the Drillbit process maintains its heartbeat
> connection, it may show up in the output.
>
> If you take a look at the logs, you might be able to figure out what is
> causing the issue. There may be orphan Drillbit processes which may have
> been left behind due to a previous unclean shutdown. Can you clean up all
> Drillbit processes (using 'ps -ef | grep -i drillbit' and then a kill -9)
> on nodes where you suspect issues and restart Drillbits?
>
> -Abhishek
>
> On Tue, Jul 10, 2018 at 7:16 PM Divya Gehlot <[email protected]>
> wrote:
>
> > Hi,
> > select * from sys.drillbits;
> > What does the above query show if a Drillbit process hangs?
> >
> >
> > Thanks
> >
> > On Tue, 10 Jul 2018 at 15:36, Khurram Faraaz <[email protected]> wrote:
> >
> > > You can run the query below and look at the *state* column in the
> > > result. Online Drillbits will be marked as ONLINE.
> > >
> > > select * from sys.drillbits;
> > >
> > > - Khurram
> > >
> > > On Tue, Jul 10, 2018 at 12:24 AM, Divya Gehlot <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > > I would like to know the best practice for checking Drillbit status
> > > > in cluster mode.
> > > > I have encountered a scenario where the Drillbit processes appear to
> > > > be running fine, but the Drill Web UI shows some of the Drillbits as
> > > > down.
> > > > When I did a root cause analysis (RCA), I found that, for some
> > > > reason, the Drillbit processes had hung.
> > > > For now, the alert system I have implemented checks the output of
> > > >
> > > >
> > > > > drill/bin/drillbit.sh status
> > > >
> > > >
> > > > Is there a better way to catch a hung Drillbit process?
> > > > I would appreciate advice from Drill community users.
> > > >
> > > > Thanks,
> > > > Divya
> > > >
> > >
> >
>
