[ 
https://issues.apache.org/jira/browse/HADOOP-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160057#comment-16160057
 ] 

Allen Wittenauer edited comment on HADOOP-14855 at 9/9/17 7:17 PM:
-------------------------------------------------------------------

jstack (and its buddy, jps) are problematic.

* jstack can crash/break processes because it uses things like SIGSTOP
* jstack is non-standard 
-- To my knowledge, it is NOT part of the JRE spec.
-- Oracle: "This utility is unsupported and may or may not be available in 
future versions of the JDK."
-- IBM: no such command

I think everyone needs to take a step back and look at this situation in a more 
complete light.  The main problem here is pid cycling.  This is caused by 
starting and stopping a lot of processes.  For things like the NN, RM, etc, 
this just doesn't happen in production.  It does (very rarely) happen on the 
worker nodes (DN, NM), but most experienced operations people know exactly how 
to handle it.  Developers and QA, however, hit this problem a lot, because 
they're also always cycling processes.  I really don't want this to turn into 
the equivalent of the metrics mess we have in Hadoop.  ("We should expose 
system metric X in the YARN UI!"  Who is actually looking at those values?  
It isn't ops folks--we've got other tools to tell us those things.)
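
To make the stale-pid trap concrete, here's a minimal sketch (not what the 
scripts do today) of a check that refuses to trust a pidfile unless the 
process's command line also matches.  The function name is made up and the 
/proc lookup is Linux-only:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the pid in the pidfile is alive AND
# its command line matches the daemon we expect.  This guards against pid
# reuse after heavy process cycling or a reboot.
daemon_is_running() {
  local pidfile="$1" pattern="$2" pid
  [[ -f "${pidfile}" ]] || return 1          # no pidfile -> not running
  pid=$(<"${pidfile}")
  [[ "${pid}" =~ ^[0-9]+$ ]] || return 1     # garbage in the pidfile
  kill -0 "${pid}" 2>/dev/null || return 1   # no such process
  # Linux-specific: confirm the command line mentions our daemon.
  grep -q "${pattern}" "/proc/${pid}/cmdline" 2>/dev/null
}
```

A plain `kill -0` alone is exactly the "any process with that pid" check the 
bug report describes; the cmdline match is the extra step that matters.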

I think it's important for folks to go look at HADOOP-13225.  It really sort of 
exemplifies the power that we've given users over the shell scripts in 3.x.  
Yes, we could require daemontools' setlock.  But then we look like morons on 
operating systems that have other tools (SMF, launchd, systemd, etc.).  If we 
want to get fancy, then we should be providing examples of using alternate daemon 
control tools.  
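
For illustration, here's roughly what the setlock idea looks like with 
util-linux's flock(1) instead, since that's what most Linux boxes actually 
ship.  The path is invented and this is a sketch, not a proposal for the 
scripts:

```shell
#!/usr/bin/env bash
# Sketch: serialize daemon startup with flock(1) instead of daemontools'
# setlock.  The lock is tied to an open file descriptor, so it vanishes
# automatically when the holder dies -- no stale pidfile to misread.
lockfile="/tmp/hadoop-namenode.lock"   # illustrative path only

exec 200>"${lockfile}"
if ! flock -n 200; then
  echo "namenode appears to be running (lock held); not starting" >&2
  exit 1
fi
# ... exec the real daemon here; it inherits fd 200, keeping the lock ...
echo "lock acquired; starting daemon"
```

The same shape works with SMF, launchd, or systemd doing the serialization 
for you, which is the point about not hard-wiring any one tool.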

We haven't even covered what happens to the process space in containerized 
systems:

{code}
    root     1     0   0   Aug 25 ?           0:01 /sbin/init
    root  1740  1450   0   Aug 25 ?           0:01 /sbin/init -m verbose
    root  1738  1449   0   Aug 25 ?           0:01 /sbin/init
{code}

Yes, those are separate init processes, something that shouldn't happen.  
Except they are running in different Illumos zones and this is the view from 
the global zone, which can see (but not touch) all processes from all zones.  
I've never tried it, but I wouldn't be too surprised if LXC-style containers' 
(Docker, rkt, etc.) processes can be seen and touched from the parent, since 
the boundaries are much less defined.

...

There is a lot of nuance involved in this problem.  Again, I'm going to double 
down on the claim that this is less of an issue for other apps because they 
don't have our Java-induced startup time.  If we weren't using Java (at least 
with this massive classpath), we'd just fire the daemon up and let it sort 
things out on its own.  But we're not.  So we can't.

That's why this is exactly the space where vendors who support specific 
operating systems should be providing value add by making such controls part of 
their offering.  We should probably add a hadoop-vendor-functions.sh or 
something to make it easier for vendors to override things.
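
To be clear about what I mean by hadoop-vendor-functions.sh (the file doesn't 
exist; the name is hypothetical), it would follow the same source-and-override 
pattern the 3.x scripts already use for user functions:

```shell
#!/usr/bin/env bash
# Sketch of a vendor override hook (hypothetical file name, modeled on the
# existing user-functions mechanism): define a default function, then let a
# vendor-supplied file redefine it before it's ever called.

hadoop_start_daemon() {           # default implementation
  echo "default start: $1"
}

# If the vendor ships an override file, source it; any function it defines
# with the same name silently replaces the default.
vendor_file="${HADOOP_CONF_DIR:-/etc/hadoop}/hadoop-vendor-functions.sh"
[[ -r "${vendor_file}" ]] && . "${vendor_file}"

hadoop_start_daemon "namenode"
```

A vendor integrating with SMF or systemd would just redefine the start/stop 
functions and never touch the stock pid handling at all.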

FWIW, I figure for those not using systemd and the like, we'll likely see user 
functions that turn off pid handling, plus simple init scripts that use 
[daemon|http://libslack.org/daemon/manpages/daemon.1.html], appear relatively 
quickly.  There's the potential for a whole ecosystem of home-grown mods like 
this waiting to happen with 3.x.
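
e.g., an init script along these lines (untested sketch; flags are from the 
daemon(1) man page, paths are invented, and the stock pid handling would be 
disabled via user functions since daemon(1) keeps its own pidfile):

```shell
#!/bin/sh
# Illustrative wrapper around libslack's daemon(1); check flags against
# your installed version's man page before use.
case "$1" in
  start)
    daemon --name hdfs-namenode --respawn \
      -- hdfs --config /etc/hadoop namenode
    ;;
  stop)
    daemon --name hdfs-namenode --stop
    ;;
esac
```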



> Hadoop scripts may errantly believe a daemon is still running, preventing it 
> from starting
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14855
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14855
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: scripts
>    Affects Versions: 3.0.0-alpha4
>            Reporter: Aaron T. Myers
>
> I encountered a case recently where the NN wouldn't start, with the error 
> message "namenode is running as process 16769.  Stop it first." In fact the 
> NN was not running at all, but rather another long-running process was 
> running with this pid.
> It looks to me like our scripts just check to see if _any_ process is running 
> with the pid that the NN (or any Hadoop daemon) most recently ran with. This 
> is clearly not a fool-proof way of checking to see if a particular type of 
> daemon is now running, as some other process could start running with the 
> same pid since the daemon in question was previously shut down.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
