(I know this is the SLURM list, but many of the folks here use NHC
with SLURM, so I'm hoping it's not a problem.  If it is, please accept
my humble apologies!)


To all users of Warewulf NHC:

[TL;DR:  NHC is now its own project with a new name, new Git repository
(GitHub AND Bitbucket), new mailing lists, and new real-time chat
resources!  New version 1.4.2 has been released as well.  See below
for details and URLs!]


Since its initial public release in early 2012, our work on the Node
Health Check (NHC) tools has been performed and published under the
umbrella of the Warewulf Project (http://warewulf.lbl.gov/).  This was
done for a number of reasons -- sharing of resources, mutual
promotion, etc.  However, this created one very large and unexpected
downside:  user confusion.

You see, many users interpreted "Warewulf Node Health Check" to mean
that it was only intended/suitable for use on cluster nodes managed by
Warewulf, or that it required Warewulf in order to function properly,
or any number of other misunderstandings, the end result of which was
that potential users opted not to give it a try!

Our #1 goal since the very start has been to build a community of
users around NHC to share and exchange ideas, health checks, tools,
and other code, maximizing our collective ability to deliver excellent
service and unsurpassed availability to our customers -- the
scientists and researchers tasked with no less than literally changing
and saving our world.  Anything that gets in the way of that goal is a
problem.  A big problem.

So today I am thrilled to announce the creation of an independent
project:  Lawrence Berkeley National Laboratory (LBNL) Node Health
Check.  All development work on what used to be Warewulf NHC has now
moved over to LBNL NHC and will continue under that identity going
forward.  As a result, NHC will no longer be making use of Warewulf
project resources for its primary development activities.

As such, new discussion forums have been created, and users interested
in following or discussing the ongoing development of LBNL Node Health
Check will want to join these groups, either by subscribing to them as
mailing lists or by using the online web forums.  Users of NHC should
join the Users' list ([email protected] or
https://groups.google.com/a/lbl.gov/forum/?hl=en#!forum/nhc), and
those interested in following development or contributing code should
also join the Developers' list ([email protected] or
https://groups.google.com/a/lbl.gov/forum/?hl=en#!forum/nhc-devel).
The Users' list will likely be very low-traffic; the Developers' list
receives development activity notifications, and will therefore see a
bit more traffic, but still no more than a handful of messages each
day.

Perhaps the most significant and most user-visible change is that the
source code repository has been moved over to GitHub.  The front page
for the repository (which also doubles as the project home page and
documentation page) is now at https://github.com/mej/nhc.  (For the
repository URL, just add ".git" on the end, or use
git+ssh://[email protected]/mej/nhc.git if you're a GitHub user.)  For
those who prefer Atlassian's Bitbucket instead, we have an equivalent
site for NHC at https://bitbucket.org/mej0/nhc (git repo
https://bitbucket.org/mej0/nhc.git or
git+ssh://[email protected]:mej0/nhc.git).  This means anyone wishing
to contribute to NHC development -- by fixing bugs, adding new checks,
updating documentation, writing unit tests, or even enhancing the
default example configuration file -- can now do so quickly and easily
using the facilities of Git and GitHub/Bitbucket rather than having to
e-mail patches around everywhere.  We can finally take advantage of
modern development technologies and collaboration features such as
Issues, Pull Requests, Forks, and more...and I hope many of you will!

Last, but certainly not least, we will be offering multiple options
for those who like to use real-time chat for communications, Q&A, and
troubleshooting assistance.  For the old-school folks who prefer
strictly text-based chat, traditional IRC will continue to be
available; we have created the channel #lbnl-nhc on irc.freenode.net.
Users may also elect to use Gitter instead; GitHub users can access
https://gitter.im/mej/nhc with their GitHub credentials.  For those
familiar with Slack, invitations are available to the NHC Slack
instance (at https://mej.slack.com/messages/nhc/) by contacting me
privately.  And finally we are testing out Ryver (ryver.com) as a
possible alternative to Slack, so if you're interested in
using/helping test our Ryver instance (at
https://lbnl.ryver.com/index.html#channel/4), let me know!

To top it all off, we've released version 1.4.2 of LBNL Node Health
Check to kick things off right!  This release offers a couple of new
checks, new features for some of the existing checks, and of course
completely updated/refactored Markdown-based documentation for the
move to GitHub.  As you might expect, the packages are now named
"lbnl-nhc" instead of "warewulf-nhc"; triggers have been added to the
RPMs which will seamlessly handle renaming the upstream script files
(in /etc/nhc/scripts/*.nhc), but users installing from the tarball may
need to rename some files by hand.  To access the source tarballs
and/or RPMs for this release, you can download them either from GitHub
(https://github.com/mej/nhc/releases/tag/1.4.2) or from JFrog's
Bintray (https://bintray.com/lbnl/nhc-src and
https://bintray.com/lbnl/nhc-rpm).

Whew, that's a lot of information!  If you have any questions, please
feel free to reply to this e-mail or join one of the above resources
to discuss!  And for those attending the Supercomputing 2015
conference in Austin next week, I'll be there as well and plan to
attend the SLURM BoF, and I look forward to the opportunity to chat
with you there!


Best regards,

Michael
LBNL NHC Project Lead Developer
Warewulf Project Developer

-- 
Michael Jennings <[email protected]>
Senior HPC Systems Engineer
High-Performance Computing Services
Lawrence Berkeley National Laboratory
Bldg 50B-3209E        W: 510-495-2687
MS 050B-3209          F: 510-486-8615
