Re: adding HA monitoring to bigtop

Roman Shaposhnik Fri, 12 Oct 2012 15:18:23 -0700

On Thu, Oct 11, 2012 at 4:27 PM, Jos Backus <[email protected]> wrote:
> I'm still working on this, now in the context of HDP. My goal at work is to
> ultimately use the notify functionality in daemontools-encore to send
> program crash/restart alerts, and change the various wrapper scripts
> (start-dfs.sh, etc.) to use daemontools-encore (and shmux), per the ticket.


Jos, is there any chance to see glimpses of your code? It doesn't have
to be complete but it would be very useful as a starting point. Just like
what is happening on BIGTOP-713 right now.

> It would be great to be able to deprecate the init scripts completely
> long-term and replace them with hooks for process supervision systems such
> as daemontools-encore, which is very portable, or for the most popular
> process supervision tools out there (Upstart, SMF, launchd).

I think the unfortunate complication here is that we don't have the luxury
of converging on a single service management system. We will always be
supporting a number of them in Bigtop. And most likely init.d scripts
will stick around for at least as long as RHEL5 is sticking around.

So the question then becomes -- do we strive for the lowest common
denominator that works across all the distros we care about, or do
we provide hooks for *all* of these systems?

Thoughts?

> Also, right now a major source of trouble is the mess that are the
> startup/wrapper scripts for Hadoop, because of the plethora of global
> environment variables and shell code that sets/reads those variables.

While working on Bigtop I've come to realize that the needs of upstream
developers might actually be very different from the needs of downstream
DevOps. IOW, it may not be out of the question for us to completely bypass
upstream service management scripts and replace them with our own
implementations. Essentially, for things like systemd we'd have to do that
anyway.

If we embark on this project we have a chance of completely unifying
how things are done -- at the end of the day all services in the Hadoop
ecosystem end up in a java/jsvc invocation with certain env. vars
and arguments passed to the JVM.
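
To make that concrete, here is a rough Python sketch of what a unified
service description could look like. This is not Bigtop code: ServiceSpec
and java_argv are hypothetical names, and the paths and JVM options are
made-up illustrative values. It only shows that a service boils down to
env. vars plus a JVM command line.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceSpec:
    """Everything needed to launch one JVM-based daemon (hypothetical)."""
    main_class: str
    classpath: list
    jvm_opts: list = field(default_factory=list)
    args: list = field(default_factory=list)
    env: dict = field(default_factory=dict)


def java_argv(spec):
    """Build the argv list for a plain `java` launch of the service."""
    return ["java", *spec.jvm_opts, "-cp", ":".join(spec.classpath),
            spec.main_class, *spec.args]


# Illustrative values only; the paths and options are assumptions.
namenode = ServiceSpec(
    main_class="org.apache.hadoop.hdfs.server.namenode.NameNode",
    classpath=["/etc/hadoop/conf", "/usr/lib/hadoop/*"],
    jvm_opts=["-Xmx1g"],
    env={"HADOOP_CONF_DIR": "/etc/hadoop/conf"},
)
argv = java_argv(namenode)
```

Under this assumption, an init.d script, an Upstart job, or a systemd unit
would each only need to export the entries of env and exec the resulting
argv.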

> There's also the continuing issue that the various organizations/vendors
> can't seem to make their minds up about the script UIs; there are the
> hadoop, mapred, hdfs, yarn, hadoop-daemon.sh and hadoop-daemons.sh
> commands, all which source various config files and use a ton of global
> variables. It's unclear which ones to use such that the right configuration
> is applied. So my plan is to use the most low-level interface and stick all
> needed environment variables in /service/foo/env/... so they are easy to
> find, query and set in a platform-independent manner.
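
For anyone unfamiliar with the env-directory idea, a minimal Python sketch,
loosely modeled on the daemontools envdir convention (one file per
variable, first line of the file is the value). read_envdir is a
hypothetical helper and the variable values are illustrative. As a
simplification, an empty file maps to an empty string here, whereas envdir
itself treats an empty file as "unset this variable".

```python
import tempfile
from pathlib import Path


def read_envdir(path):
    """Return {file name: first line of file} for regular files in path."""
    env = {}
    for entry in sorted(Path(path).iterdir()):
        if not entry.is_file():
            continue
        lines = entry.read_text().splitlines()
        # Trailing spaces and tabs are stripped from the value.
        env[entry.name] = lines[0].rstrip(" \t") if lines else ""
    return env


# Demo against a throwaway directory standing in for /service/foo/env.
with tempfile.TemporaryDirectory() as d:
    Path(d, "HADOOP_CONF_DIR").write_text("/etc/hadoop/conf\n")
    Path(d, "HADOOP_HEAPSIZE").write_text("1024 \n")
    service_env = read_envdir(d)
```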

Bigtop is the place for this to be resolved. I think if we do a reasonable
job of unifying these things, vendors will follow.

Basically, at this point the real issue is having enough folks interested
in improving the situation and willing to post patches to the JIRAs.
Personally, I plan to work on:
    https://issues.apache.org/jira/browse/BIGTOP-460
    https://issues.apache.org/jira/browse/BIGTOP-263
but I would surely benefit from any help I can get.

Thanks,
Roman.
