Luke, didn't I send you my fabric scripts from a while back?

Attached is an example that was used for GemFireXD (no longer available), but
most of the structure remains the same.

Another nice feature is that Fabric lets you run jobs in parallel; I've
been able to start 100+ node clusters this way in under a minute.

--Jens
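
For what it's worth, host lists for clusters that size don't need to be
hand-written. A small helper along these lines (illustrative only, not part of
the attached script; node names are made up) generates the role definitions
programmatically, which is what makes the 100+ node case manageable:

```python
def make_roledefs(first, last, locator='node0322'):
    # Role names map to host lists; Fabric tasks target them via @roles.
    # A list comprehension keeps the server list in sync with the node
    # naming scheme instead of listing 100+ hosts by hand.
    return {
        'locator': [locator],
        'servers': ['node0{0}'.format(n) for n in range(first, last + 1)],
    }
```

Assign the result to env.roledefs and every @roles-decorated task picks it up.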

On Wed, Jun 3, 2015 at 6:01 PM, Abtin Afshar <[email protected]> wrote:

> Hi Randy,
>
> You can actually do a lot with Fabric. I hacked together a quick script to
> download GemFire logs, stats, and thread dumps and zip them up from any
> cluster I want (dev, sit, uat). The beauty of it is that you only need to
> install it on your local machine (in my case my Linux VM), and it uses ssh
> under the hood. You can also add any Python goodness to your script and
> control your cluster with a simple command.
>
> Cheers!
> Abtin
>
>
> > On Jun 3, 2015, at 3:41 PM, Randy May <[email protected]> wrote:
> >
> > That's funny. I was just looking for something exactly like this to help
> > me out with build automation at a client. Thanks for sharing!
> >
> > On Wed, Jun 3, 2015 at 2:50 PM Luke Shannon <[email protected]> wrote:
> >
> >> I was just working with a client who is using this framework to manage
> >> all their distributed Geode processes (mainly capturing log and stats
> >> files for troubleshooting, but also parallel starts to recover from
> >> persistence).
> >>
> >> http://www.fabfile.org/
> >>
> >> I have come across tons of custom shell script solutions to do this sort
> >> of thing, and have played with Ansible myself (which is great). This one
> >> looks interesting. You can write Python, but you can also use a DSL that
> >> looks like this:
> >>
> >> from fabric.api import *
> >>
> >> env.hosts = ['cache_server1', 'cache_server2']
> >> env.user = 'my_user'
> >> env.password = 'my_pass'
> >>
> >> def download_log():
> >>     with settings(warn_only=True):
> >>         with cd('/gemfire/cache'):
> >>             get('mycache.log')
> >>
> >> --
> >> Luke Shannon | Sr. Field Engineer - Toronto | Pivotal
> >>
> >> -------------------------------------------------------------------------
> >> Join the Toronto Pivotal Usergroup:
> >> http://www.meetup.com/Toronto-Pivotal-User-Group/
> >>
>
>
# This script represents a fabric command file used to manage GemFireXD locators
# and servers. Use the 'fab' tool to run it. Read more about fabric here:
# http://www.fabfile.org/
#
# For example, you can start your servers with:
#
#   $> fab start_servers
#
# To start locators and servers, you might do:
#
#   $> fab start_locator start_servers
#
# Fabric is appealing because it allows you to group all of your commands in a
# single file. It has a ton more functionality than is represented here.
#
# I'll typically use this where I need to do repeated tests and I'm cycling my
# cluster very often. It ensures consistency and repeatability in testing.
#
# Some things to note:
#
#   - Define role names in env.roledefs and then use them with the '@roles'
#     decorator.
#   - Tasks are executed serially on each host defined by a role. Tasks can be
#     executed in parallel with the '@parallel' decorator. Be careful that
#     your code doesn't manipulate the state of 'env', as it is shared across
#     all threads.
#   - Tasks can be run locally with 'local()'.
#   - When starting up a lot of servers I prefer to use 'dtach' and launch in
#     parallel. dtach is like a stripped-down version of screen and lets your
#     commands run in detached mode. I have found that other solutions like
#     nohup or plain backgrounding don't work well with fabric, or simply
#     don't work because of pty weirdness.
#
#     Redirection is a pain with dtach though. Don't try doing:
#     'dtach -n /tmp/sock.dtach foo.sh > /tmp/foo.log'. This will send the
#     output of dtach to the log and not the output of your script. The only
#     solution I have found, so far, is to have the script do the redirection
#     internally. Or if you're trying to launch an exe this way, you need to
#     wrap it in a script first.
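
# As a concrete sketch of the redirection workaround described above (this
# helper is illustrative, not part of the original script): build the dtach
# invocation so that the redirection happens inside the detached shell rather
# than around dtach itself.
def dtach_cmd(socket, command, logfile):
    # 'sh -c' makes the redirection part of the detached command; redirecting
    # dtach itself would only capture dtach's own (empty) output.
    return "dtach -n {0} sh -c '{1} > {2} 2>&1'".format(socket, command, logfile)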

from fabric.api import *
from fabric.contrib.files import *
from fabric.contrib.project import *
from fabric.context_managers import *

env.roledefs = {
        'locator': ['node0322'],
        'servers': [ 'node0{0}'.format(x) for x in range(324, 416) ]
}

# These are nodes which are down right now
exclude_hosts = [
        'node0331', 'node0369'
]

for i in exclude_hosts:
    for host_list in env.roledefs.values():
        if i in host_list:
            host_list.remove(i)

java_exe = '/usr/java/jdk1.7.0_51/bin/java'
gfxd = '/home/gemfire/Pivotal_GemFireXD_101_b47290_Linux/bin/gfxd'
locator_port = 19991
locator_str = 'node0322[19991]'
archive_dir = 'archives'

@task
@parallel
@roles('locator', 'servers')
def setup():
    run('yum install -y dtach')


@task
@roles('locator')
def start_locator(cleanup=False):
    work_dir = '/home/gemfire/locator'

    if cleanup:
        run('rm -rf {0}'.format(work_dir))

    if not exists(work_dir):
        run('mkdir -p {0}'.format(work_dir))

    with shell_env(GFXD_JAVA=java_exe):
        run(('{0} locator start '
             '-dir={1} '
             '-peer-discovery-port={2} '
             '-client-bind-address={3} '
             '-jmx-manager-start=true '
             '-J-javaagent:/home/gemfire/jolokia-jvm-1.2.2-agent.jar=host=0.0.0.0').format(
                 gfxd,
                 work_dir,
                 locator_port,
                 env.host_string))

@task
@roles('locator')
def stop_locator(cleanup=False):
    work_dir = '/home/gemfire/locator'
    if exists(work_dir):
        with shell_env(GFXD_JAVA=java_exe):
            run('{0} locator stop -dir={1}'.format(gfxd, work_dir), warn_only=True)

    if cleanup:
        run('rm -rf {0}'.format(work_dir))

@task
@parallel
@roles('servers')
def stop_servers(cleanup=False):
    work_dir = '/home/gemfire/server'

    if not exists(work_dir):
        run('mkdir -p {0}'.format(work_dir))

    with shell_env(GFXD_JAVA=java_exe):
        run('{0} server stop -dir={1}'.format(gfxd, work_dir), warn_only=True)

    if cleanup:
        run('rm -rf {0}'.format(work_dir))

@task
@parallel
@roles('servers')
def start_servers(cleanup=False):
    work_dir = '/home/gemfire/server'

    if cleanup:
        run('rm -rf {0}'.format(work_dir))

    if not exists(work_dir):
        run('mkdir -p {0}'.format(work_dir))

    with shell_env(GFXD_JAVA=java_exe):
        run(('dtach -n /tmp/gfxd.dtach {gfxd_bin} server start '
             '-dir={work_dir} '
             '-off-heap-memory-size=30g '
             '-J-Xms8g '
             '-J-Xmx8g '
             '-J-Xmn2g '
             '-J-XX:CMSInitiatingOccupancyFraction=70 '
             '-J-XX:+UseCMSInitiatingOccupancyOnly '
             '-J-XX:+CMSClassUnloadingEnabled '
             '-J-XX:+DisableExplicitGC '
             '-J-XX:+PrintGCDetails '
             '-J-XX:+PrintGCTimeStamps '
             '-J-XX:+PrintGCDateStamps '
             '-critical-heap-percentage=94 '
             '-conserve-sockets=false '
             '-statistic-sampling-enabled=true '
             '-statistic-archive-file={work_dir}/statarchive.gfs '
             '-enable-time-statistics=false '
             '-bind-address={host} '
             '-locators={locators}').format(
                gfxd_bin=gfxd,
                work_dir=work_dir,
                host=env.host_string,
                locators=locator_str), warn_only=True)


@task
def shutdown_all():
    local('{0} shut-down-all -locators={1}'.format(gfxd, locator_str))


@task
@parallel
@roles('servers')
def stats():
    with settings(quiet=True):
        local('mkdir -p {0}'.format(archive_dir))
        with cd('/home/gemfire/server'):
            run(('find . -name "*.gfs" -o -name "*.log" '
                 '| tar czvf stats-{0}.tgz -T -').format(env.host_string))
            get('stats-{0}.tgz'.format(env.host_string), archive_dir)


@task
@parallel
@roles('servers')
def kill_servers():
    run('pkill -9 -f Gfxd', warn_only=True)
