Re: [Beowulf] Question about fair share

2022-01-24 Thread Skylar Thompson
On Mon, Jan 24, 2022 at 01:17:30PM -0600, Tom Harvill wrote:
> 
> 
> Hello,
> 
> We use a 'fair share' feature of our scheduler (SLURM) and have our decay
> half-life (the time needed for priority penalty to halve) set to 30 days. 
> Our maximum job runtime is 7 days.  I'm wondering what others use, please
> let me know if you can spare a minute.  Thank you!

We're a Grid Engine shop, not SLURM, but a few years ago we significantly
reduced the weight of the fair-share policy and boosted the relative weight
of the functional policy. The problem we were having was that the 
fair-share policy would take a long time to adjust to sudden changes in
usage and trying to determine what someone's priority would be/should be
based on prior usage could be pretty challenging. The functional policy
adjusts immediately based on current workload and is a lot easier to
comprehend for our users.

I'm not sure what the equivalent of the functional policy is in SLURM but
in GE it's ticket-based where accounts, projects, and "departments" (labs,
in our context) are given some number of tickets which are consumed by
running jobs, and returned when the job finishes. By default, every job
from a single source has an equal share of tickets, but that share is
adjustable on submission so a user can assign a relative importance to
their own jobs.
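
For a rough idea of what that looks like, the functional-policy weighting lives
in the scheduler configuration and the per-job share is just a submit option.
This is a simplified sketch; the parameter names are standard GE scheduler
settings, but the numbers are made up rather than our production values:

   # excerpt of "qconf -msconf": shift weight from the share tree to the
   # functional policy (illustrative values only)
   weight_tickets_share        0
   weight_tickets_functional   10000

   # a user boosting the relative importance of one of their own jobs
   qsub -js 100 urgent_analysis.sh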

We also use the urgency policy heavily, where the resource requests
of a job influence its final priority. This lets us boost the priority for
jobs requesting hard-to-satisfy resources (lots of memory on one node,
GPUs, etc.) to avoid starving them amongst a swarm of tiny jobs.
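
As a concrete (if simplified) example, the per-resource urgency is part of the
complex definition; the column layout is what "qconf -mc" presents, but the
"gpu" and "h_vmem" rows below are illustrative and the urgency values are made
up:

   #name     shortcut  type    relop  requestable  consumable  default  urgency
   gpu       gpu       INT     <=     YES          YES         0        5000
   h_vmem    h_vmem    MEMORY  <=     YES          YES         0        100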

Scheduling policy is a really iterative process, and it took us a long time to
tweak ours to everyone's (mostly) satisfaction.

-- 
Skylar


Re: [Beowulf] [EXTERNAL] server lift

2021-10-22 Thread Skylar Thompson
On Fri, Oct 22, 2021 at 10:24:47AM -0700, David Mathog wrote:
> On Thu, 21 Oct 2021 15:00:22 -0400 Prentice Bisbal wrote:
> 
> > We have one of these where I work:
> >
> > https://serverlift.com/data-center-lifts/sl-350x/
> 
> Wish I had had something like that.
> 
> The only downside to that unit I can see is that the crank is on the
> opposite side of the unit from where the lifted computer would be.  In a
> narrow aisle situation a person working alone would have to walk all the
> way around to reach the other side to install/remove the server after
> having set the height.

Yep, that's basically the Genie Lift system we have, though ours maybe has a
bit longer base. Having a load platform is really nice; the only problem we
have is the crank being on the opposite side from the platform.

With a little bit of finagling, one person can actually get a system in
and out of rails by raising (to remove) or lowering (to install) the
platform and moving the system a bit by hand. With the platform in place,
even if the system falls, it won't fall more than a 1/4" or so.

-- 
Skylar


Re: [Beowulf] server lift

2021-10-18 Thread Skylar Thompson
We're using a GL-8 lift, which can fit down our ~2'-wide cold aisles. It
can be a little bit awkward but definitely keeps the workplace safety
people happy.

On Mon, Oct 18, 2021 at 10:32:31AM -0400, Michael Di Domenico wrote:
> we're using an older genie lift as a server lift currently.  which as
> you can guess isn't designed for servers.  the most recent set of
> compute nodes we purchased are pretty much impossible to lift by man
> (even if that was a good idea) and the genie lift is getting awkward
> in our cramped data center given its design and the server weight.
> 
> i can certainly google for one, but they all look great in the
> glossies.  does anyone want to provide some real world info?

-- 
Skylar


Re: [Beowulf] Data Destruction

2021-09-29 Thread Skylar Thompson
In this case, we've successfully pushed back to the granting agency (generally
the US NIH, for us), arguing that it's just not feasible to guarantee that the data
are truly gone on a production parallel filesystem. The data are encrypted
at rest (including offsite backups), which has been sufficient for our
purposes. We'll then just use something like GNU shred(1) to do a
best-effort secure delete.
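
For what it's worth, the best-effort pass is nothing fancier than something
like this (GNU coreutils shred; of limited value against RAID, snapshots, and
cached data for the reasons below):

   # overwrite three times, add a final pass of zeroes, then unlink
   shred --iterations=3 --zero --remove /path/to/sensitive_file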

In addition to RAID, other confounding factors to be aware of are snapshots
and cached data.

On Wed, Sep 29, 2021 at 10:52:33AM -0400, Paul Edmon via Beowulf wrote:
> I guess the question is for a parallel filesystem how do you make sure you
> have 0'd out the file without borking the whole filesystem since you are
> spread over a RAID set and could be spread over multiple hosts.
> 
> -Paul Edmon-
> 
> On 9/29/2021 10:32 AM, Scott Atchley wrote:
> > For our users that have sensitive data, we keep it encrypted at rest and
> > in movement.
> > 
> > For HDD-based systems, you can perform a secure erase per NIST
> > standards. For SSD-based systems, the extra writes from the secure erase
> > will contribute to the wear on the drives and possibly their eventually
> > wearing out. Most SSDs provide an option to mark blocks as zero without
> > having to write the zeroes. I do not think that it is exposed up to the
> > PFS layer (Lustre, GPFS, Ceph, NFS) and is only available at the ext4 or
> > XFS layer.
> > 
> > On Wed, Sep 29, 2021 at 10:15 AM Paul Edmon wrote:
> > 
> > The former.  We are curious how to selectively delete data from a
> > parallel filesystem.  For example we commonly use Lustre, ceph,
> > and Isilon in our environment.  That said if other types allow for
> > easier destruction of selective data we would be interested in
> > hearing about it.
> > 
> > -Paul Edmon-
> > 
> > On 9/29/2021 10:06 AM, Scott Atchley wrote:
> > > Are you asking about selectively deleting data from a parallel
> > > file system (PFS) or destroying drives after removal from the
> > > system either due to failure or system decommissioning?
> > > 
> > > For the latter, DOE does not allow us to send any non-volatile
> > > media offsite once it has had user data on it. When we are done
> > > with drives, we have a very big shredder.
> > > 
> > > On Wed, Sep 29, 2021 at 9:59 AM Paul Edmon via Beowulf
> > > <beowulf@beowulf.org> wrote:
> > > 
> > > Occasionally we get DUA (Data Use Agreement) requests for
> > > sensitive
> > > data that require data destruction (e.g. NIST 800-88). We've
> > > been
> > > struggling with how to handle this in an era of distributed
> > > filesystems
> > > and disks.  We were curious how other people handle requests
> > > like this?
> > > What types of filesystems do people generally use for this
> > > and how do
> > > people ensure destruction?  Do these types of DUA's preclude
> > > certain
> > > storage technologies from consideration or are there creative
> > > ways to
> > > comply using more common scalable filesystems?
> > > 
> > > Thanks in advance for the info.
> > > 
> > > -Paul Edmon-
> > > 


-- 
Skylar


Re: [Beowulf] Data Destruction

2021-09-29 Thread Skylar Thompson
We have one storage system (DDN/GPFS) that is required to be
NIST-compliant, and we bought self-encrypting drives for it. The up-charge
for SED drives has diminished significantly over the past few years so that
might be easier than doing it in software and then having to verify/certify
that the software is encrypting everything that it should be.

On Wed, Sep 29, 2021 at 09:58:58AM -0400, Paul Edmon via Beowulf wrote:
> Occasionally we get DUA (Data Use Agreement) requests for sensitive data
> that require data destruction (e.g. NIST 800-88). We've been struggling with
> how to handle this in an era of distributed filesystems and disks.  We were
> curious how other people handle requests like this?  What types of
> filesystems do people generally use for this and how do people ensure
> destruction?  Do these types of DUA's preclude certain storage technologies
> from consideration or are there creative ways to comply using more common
> scalable filesystems?
> 
> Thanks in advance for the info.
> 
> -Paul Edmon-
> 

-- 
Skylar


Re: [Beowulf] Power Cycling Question

2021-07-16 Thread Skylar Thompson
One problem with suspend/sleep is if you have services that depend on
persistent TCP connections. I don't know that GPFS (er, sorry, "Spectrum
Scale"), for instance, would be consistently tolerant of its daemon
connections being interrupted, even if the node in question wasn't actually
doing any I/O.

We tried engineering our own "green cluster" automation with Grid Engine
years ago, shutting down idle nodes until they were needed, but doing it
independently of the resource manager was far too complicated for us to
maintain, especially since it was all cost and no benefit for us, with our
power and cooling charges being absorbed through a flat overhead rate. This
might not be as big an issue for schedulers/resource managers that have
fewer requestable resources than GE, or for sites that are billed for the
power/cooling they actually use and can more easily justify the staff time
to manage the extra complexity.
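
For sites that do want to try it, the Slurm hooks Doug mentions below boil
down to a handful of slurm.conf settings; this is just a sketch (the script
paths and timings are hypothetical, and we don't run this ourselves):

   SuspendProgram=/usr/local/sbin/node_poweroff.sh
   ResumeProgram=/usr/local/sbin/node_poweron.sh
   SuspendTime=1800              # power down nodes idle for 30 minutes
   SuspendTimeout=120
   ResumeTimeout=600
   SuspendExcNodes=login[01-02]  # never power down the login nodes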

On Sat, Jul 17, 2021 at 12:43:27AM +0100, Jörg Saßmannshausen wrote:
> Hi Doug,
> 
> interesting topic and quite apt when I look at the flooding in Germany, 
> Belgian and The Netherlands. 
> 
> I guess there are a number of reasons why people are not doing it. Discarding 
> the usual "we never done that" one, I guess the main problem is: when do you 
> want  to turn it off? After 5 mins being idle? Maybe 10 mins? One hour? How 
> often do you then need to boot them up again and how much energy does that 
> cost? From chatting to a few people who tried it in the past it somehow 
> transpired that you do not save as much energy as you were hoping for. 
> 
> However, on thing came to my mind: is it possible to simply suspend it to 
> disc 
> and then let it be sleeping? That way, you wake the node up quicker and 
> probably need less power when it is suspended. Think of laptops. 
> 
> The other way around would simply be: we know in say the summer, there is 
> less 
> demand so we simply turn X number of nodes off and might do some maintenance 
> on them. So you are running the whole cluster for say 6 weeks with limited 
> capacity. That might mean a few jobs are queuing but that also will give us a 
> window to do things. Once people are coming back, the maintenance is done and 
> the cluster can run at full capacity again. 
> 
> Just some (crazy?) ideas.
> 
> All the best
> 
> Jörg
> 
> Am Freitag, 16. Juli 2021, 20:35:11 BST schrieb Douglas Eadline:
> > Hi everyone:
> > 
> > Reducing power use has become an important topic. One
> > of the questions I always wondered about is
> > why more cluster do not turn off unused nodes. Slurm
> > has hooks to turn nodes off when not in use and
> > turn them on when resources are needed.
> > 
> > My understanding is that power cycling creates
> > temperature cycling, that then leads to premature node
> > failure. Makes sense and has anyone ever studied/tested
> > this ?
> > 
> > The only other reason I can think of is that the delay
> > in server boot time makes job starts slow or power
> > surge issues.
> > 
> > I'm curious about other ideas or experiences.
> > 
> > Thanks
> > 
> > --
> > Doug
> 
> 
> 

-- 
Skylar


Re: [Beowulf] Odd NFS write issue for commands issued in a script

2020-12-11 Thread Skylar Thompson
Is it possible that /usr/common/tmp/outfile.txt already exists, and the
shell has noclobber set?
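
A quick way to check, since noclobber failures are easy to miss when the
command runs non-interactively (the path here is just the one from your
example):

   set -o noclobber
   echo test > /usr/common/tmp/outfile.txt    # fails if the file exists:
                                              # "cannot overwrite existing file"
   echo test >| /usr/common/tmp/outfile.txt   # >| overrides noclobber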

On Tue, Dec 08, 2020 at 05:30:14PM -0800, David Mathog wrote:
> Can anybody suggest why a script which causes writes to an NFS mounted
> directory like so
> 
>ssh remotenode 'command >/usr/common/tmp/outfile.txt'
> 
> could somehow fail that write silently, but this variant
> 
>ssh remotenode 'command >/tmp/outfile; mv /tmp/outfile
> /usr/common/tmp/outfile.txt'
> 
> would always succeed?
> 
> (Actually it is slightly more complicated than this because
> the whole command string shown above is constructed and then run in another
> program within a system() call.  Initially this turned up inside a threaded
> version, but it does it even with a straight system() call.  I cannot
> reproduce this problem by running the ssh commands from the command line, it
> only happens inside the script.  The files so far have been relatively
> small, less than 50kb.  "command" is a run of the NCBI blastn program,
> although that is probably irrelevant.)
> 
> I have even seen this happen:
> 
>ssh remotenode 'command >/usr/common/tmp/outfile.txt; ls -al
> /usr/common/tmp/outfile.txt'
>ls -al /usr/common/tmp/outfile.txt
> 
> where the first ls (running on the remote node) shows the output file while
> the second (running on the NFS server) does not.
> 
> This is on a CentOS 7 system.  The server was last updated 8 days ago but
> the compute nodes have not been updated in almost a year.
> 
> Server kernel is  3.10.0-1160.6.1.el7.x86_64
> Client kernel is  3.10.0-1062.12.1.el7.x86_64
> 
> There are no error messages in stderr, /var/log/messages, or dmesg.
> 
> The client's fstab has:
> 
>   server:/usr/common   /usr/common nfs bg,hard,intr,rw 1   1
> 
> and the server's /etc/exports has:
> 
>   /usr/common  *.cluster(rw,sync,no_root_squash)
> 
> 
> Thanks,
> 
> David Mathog
> mat...@caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
> 

-- 
Skylar


Re: [Beowulf] ***UNCHECKED*** Re: OT, RAID controller replacement batteries?

2020-11-04 Thread Skylar Thompson
On Wed, Nov 04, 2020 at 09:54:36AM -0800, David Mathog wrote:
> 
> on Mon, 2 Nov 2020 12:31:39 Skylar Thompson wrote:
> 
> > We've had the same problems, one somewhat-effective trick has been to
> > scavenge working batteries from systems we're sending to surplus so we have
> > our own supply of batteries to swap in. The failure rate is marginally
> > better than the batteries we've bought from Amazon/Newegg/eBay (as you
> > note, not great).
> 
> Did you try the no name ones too or only the ancient "New" Dell batteries?
> Buying those Dells is a bit like purchasing OEM tires for a car with 60k
> miles on them - it is the right tire, but...
> 
> Given the amount of iffy stuff floating around on ebay my first thought was
> that the Dell ones were probably fakes, but assuming the images are of the
> actual battery, why create a "new" product with a decade old time stamp?
> 
> This is yet another one of those situations where a manufacturer's custom
> batteries eventually cause grief.  From the form factor, voltage, and specs,
> these batteries appear to be very similar to a phone lithium ion
> battery, but with a different connector.  Basically like this:
> 
> https://www.amazon.com/LG-LGIP-520B-Lithium-Phone-Battery/dp/B0015A4TQK/ref=sr_1_4?dchild=1=3.7v+lithium+ion+battery=1604510929
> 
> The NU209 has a 5 pin connector, and the card can identify the battery.
> There must be some electronics in there (besides +,-, and Thermistor pins,
> although there could be two or more ground pins.)

These were all the ancient "like-new" brand-name Dell batteries; I figure I
don't want to be the one who sets off the FM200 when a really dodgy battery
catches fire...

For a lot of our nodes the local disks see very limited use, so some of them
just don't have write-cache enabled at all. In other cases, a dead BBU is the
nail in the coffin that lets us actually justify hardware retirement with a
lab.
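
If it helps anyone else, disabling the write cache is a one-liner. This is
from memory with the LSI MegaCli tool (perccli/omconfig being the Dell-branded
alternatives), so treat the exact syntax as an assumption and check the docs:

   # check BBU health
   MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL

   # force all logical drives to write-through so a dead BBU can't eat data
   MegaCli64 -LDSetProp WT -LALL -aALL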

-- 
Skylar


Re: [Beowulf] OT, RAID controller replacement batteries?

2020-11-02 Thread Skylar Thompson
We've had the same problems, one somewhat-effective trick has been to
scavenge working batteries from systems we're sending to surplus so we have
our own supply of batteries to swap in. The failure rate is marginally
better than the batteries we've bought from Amazon/Newegg/eBay (as you
note, not great).

On Mon, Nov 02, 2020 at 12:19:34PM -0800, David Mathog wrote:
> OT.
> 
> Anybody replaced a Perc 6i (or equivalent) BBU battery lately?  What did you
> use?
> 
> Because...
> 
> We have a bunch of older servers with Dell Perc RAID controllers.  The BBU
> batteries are starting to go, one by one.  For some of them Dell still has
> new batteries, but for the Perc 6i they seem to have only "refurbished"
> NU209, which literally means they pulled it out of some server they had
> around for whatever reason and are selling it.  Apparently
> dusting it off counts as "refurbishing".  If one goes to ebay and searches
> for "new" NU209 it pulls up a bunch of images of FR463 (alternate Dell part
> number?) batteries.  This is one of the newer ones:
> 
> https://www.ebay.com/p/1703443656?iid=223662829408
> 
> The labels show manufacturing dates all prior to 2014 (so far), most in the
> 2010 to 2011 era.  So not by any definition "new".
> 
> There are some generic NU209 batteries, usually marked as 0NU209, but none
> of  those show the manufacturer's name, the manufacturing date, or the mAh
> specification.  See for instance the pictures here:
> 
> https://www.ebay.com/itm/New-Battery-FR463-NU209-For-DELL-PERC-H700-H800-5i-6i-R900-RAID-Controller-US/291976607061?hash=item43fb297555:g:-PYAAOSwA29Y5SDi
> 
> The old Dell batteries seem like a terrible idea, but I'm not thrilled with
> the idea of plugging in one of the completely uncharacterized batteries
> either.  Probably the best idea is to replace the old machines, but failing
> that, are there any batteries out there which are both new and reliable
> enough to buy?
> 
> Thanks,
> 
> David Mathog
> 

-- 
Skylar


Re: [Beowulf] [External] SLURM - Where is this exit status coming from?

2020-08-13 Thread Skylar Thompson
Hmm, apparently math is hard today. I of course meant 2^7, not 2^8.

On Thu, Aug 13, 2020 at 02:37:46PM -0700, Skylar Thompson wrote:
> I think this is an artifact of the job process running as a child process of
> the job script, where POSIX defines the low-order 8 bits of the process
> exit code as indicating which signal the child process received when it 
> exited.
> 
> As others noted, 137 is 2^8+9, where 9 is SIGKILL (exceeding memory, also
> exceeding the runtime request at least in the Grid Engine world).
> 
> On Thu, Aug 13, 2020 at 02:24:49PM -0700, Alex Chekholko via Beowulf wrote:
> > This may be a "cargo cult" answer from old SGE days but IIRC "137" was
> > "128+9" and it means the process got signal 9 which means _something_ sent
> > it a SIGKILL.
> > 
> > On Thu, Aug 13, 2020 at 2:22 PM Prentice Bisbal via Beowulf <
> > beowulf@beowulf.org> wrote:
> > 
> > > I think you dialed the wrong number. We're the Beowulf people! Although,
> > > I'm sure we can still help you. ;)
> > >
> > > --
> > > Prentice
> > > On 8/13/20 4:14 PM, Altemara, Anthony wrote:
> > >
> > > Cheers SLURM people,
> > >
> > >
> > >
> > > We’re seeing some intermittent job failures in our SLURM cluster, all with
> > > the same 137 exit code. I’m having difficulty in determining whether this
> > > error code is coming from SLURM (timeout?) or the Linux OS (process 
> > > killed,
> > > maybe memory).
> > >
> > >
> > >
> > > In this example, there’s the WEXITSTATUS in the slurmctld.log, error:0
> > > status 35072 in the slurmd.log, and ExitCode 9:0 in the accounting log….???
> > >
> > >
> > >
> > > Does anyone have insight into  how all these correlate? I’ve spent a
> > > significant amount of time digging  through the documentation, and I don’t
> > > see a clear way on how to interpret all these…
> > >
> > >
> > >
> > >
> > >
> > > Example: Job: 62791
> > >
> > >
> > >
> > > [root@X]  /var/log/slurm# grep -ai jobid=62791 slurmctld.log
> > >
> > > [2020-08-13T10:58:28.599] _slurm_rpc_submit_batch_job: JobId=62791
> > > InitPrio=4294845347 usec=679
> > >
> > > [2020-08-13T10:58:29.080] sched: Allocate JobId=62791 NodeList=
> > > X #CPUs=1 Partition=normal
> > >
> > > [2020-08-13T11:17:45.275] _job_complete: JobId=62791 WEXITSTATUS 137
> > >
> > > [2020-08-13T11:17:45.294] _job_complete: JobId=62791 done
> > >
> > >
> > >
> > >
> > >
> > > [root@ X]  /var/log/slurm# grep 62791 slurmd.log
> > >
> > > [2020-08-13T10:58:29.090] _run_prolog: prolog with lock for job 62791 ran
> > > for 0 seconds
> > >
> > > [2020-08-13T10:58:29.090] Launching batch job 62791 for UID 847694
> > >
> > > [2020-08-13T11:17:45.280] [62791.batch] sending
> > > REQUEST_COMPLETE_BATCH_SCRIPT, error:0 status 35072
> > >
> > > [2020-08-13T11:17:45.405] [62791.batch] done with job
> > >
> > >
> > >
> > >
> > >
> > > [root@X]  /var/log/slurm# sacct -j 62791
> > >
> > >JobIDJobName  PartitionAccount  AllocCPUS  State
> > > ExitCode
> > >
> > >  -- -- -- -- --
> > > 
> > >
> > > 62791nf-normal+ normal (null)  0 FAILED
> > > 9:0
> > >
> > >
> > >
> > > [root@X]  /var/log/slurm# sacct -lc | tail -n 100 | grep 62791
> > >
> > > JobIDUIDJobName  Partition   NNodesNodeList
> > > State   Start End  Timelimit
> > >
> > > 62791847694 nf-normal+ normal1 XXX.+
> > > FAILED 2020-08-13T10:58:29 2020-08-13T11:17:45  UNLIMITED
> > >
> > >
> > >
> > >
> > >
> > > Thank you!
> > >
> > >
> > >
> > > Anthony
> > >
> > >
> > >
> > >

Re: [Beowulf] [External] SLURM - Where is this exit status coming from?

2020-08-13 Thread Skylar Thompson
I think this is an artifact of the job process running as a child process of
the job script, where POSIX defines the low-order 8 bits of the process
exit code as indicating which signal the child process received when it exited.

As others noted, 137 is 2^8+9, where 9 is SIGKILL (exceeding memory, also
exceeding the runtime request at least in the Grid Engine world).
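
An easy way to see the 128+signal convention from a shell, for anyone
following along (and the slurmd "status 35072" line looks like the raw wait(2)
status, since 35072 >> 8 == 137):

   bash -c 'kill -9 $$' ; echo $?   # prints 137 (128 + 9)
   kill -l 137                      # bash maps 128+N back to the name: KILL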

On Thu, Aug 13, 2020 at 02:24:49PM -0700, Alex Chekholko via Beowulf wrote:
> This may be a "cargo cult" answer from old SGE days but IIRC "137" was
> "128+9" and it means the process got signal 9 which means _something_ sent
> it a SIGKILL.
> 
> On Thu, Aug 13, 2020 at 2:22 PM Prentice Bisbal via Beowulf <
> beowulf@beowulf.org> wrote:
> 
> > I think you dialed the wrong number. We're the Beowulf people! Although,
> > I'm sure we can still help you. ;)
> >
> > --
> > Prentice
> > On 8/13/20 4:14 PM, Altemara, Anthony wrote:
> >
> > Cheers SLURM people,
> >
> >
> >
> > We’re seeing some intermittent job failures in our SLURM cluster, all with
> > the same 137 exit code. I’m having difficulty in determining whether this
> > error code is coming from SLURM (timeout?) or the Linux OS (process killed,
> > maybe memory).
> >
> >
> >
> > In this example, there’s the WEXITSTATUS in the slurmctld.log, error:0
> > status 35072 in the slurmd.log, and ExitCode 9:0 in the accounting log….???
> >
> >
> >
> > Does anyone have insight into  how all these correlate? I’ve spent a
> > significant amount of time digging  through the documentation, and I don’t
> > see a clear way on how to interpret all these…
> >
> >
> >
> >
> >
> > Example: Job: 62791
> >
> >
> >
> > [root@X]  /var/log/slurm# grep -ai jobid=62791 slurmctld.log
> >
> > [2020-08-13T10:58:28.599] _slurm_rpc_submit_batch_job: JobId=62791
> > InitPrio=4294845347 usec=679
> >
> > [2020-08-13T10:58:29.080] sched: Allocate JobId=62791 NodeList=
> > X #CPUs=1 Partition=normal
> >
> > [2020-08-13T11:17:45.275] _job_complete: JobId=62791 WEXITSTATUS 137
> >
> > [2020-08-13T11:17:45.294] _job_complete: JobId=62791 done
> >
> >
> >
> >
> >
> > [root@ X]  /var/log/slurm# grep 62791 slurmd.log
> >
> > [2020-08-13T10:58:29.090] _run_prolog: prolog with lock for job 62791 ran
> > for 0 seconds
> >
> > [2020-08-13T10:58:29.090] Launching batch job 62791 for UID 847694
> >
> > [2020-08-13T11:17:45.280] [62791.batch] sending
> > REQUEST_COMPLETE_BATCH_SCRIPT, error:0 status 35072
> >
> > [2020-08-13T11:17:45.405] [62791.batch] done with job
> >
> >
> >
> >
> >
> > [root@X]  /var/log/slurm# sacct -j 62791
> >
> >JobIDJobName  PartitionAccount  AllocCPUS  State
> > ExitCode
> >
> >  -- -- -- -- --
> > 
> >
> > 62791nf-normal+ normal (null)  0 FAILED
> > 9:0
> >
> >
> >
> > [root@X]  /var/log/slurm# sacct -lc | tail -n 100 | grep 62791
> >
> > JobIDUIDJobName  Partition   NNodesNodeList
> > State   Start End  Timelimit
> >
> > 62791847694 nf-normal+ normal1 XXX.+
> > FAILED 2020-08-13T10:58:29 2020-08-13T11:17:45  UNLIMITED
> >
> >
> >
> >
> >
> > Thank you!
> >
> >
> >
> > Anthony
> >
> >
> >
> >
> >
> > --
> > Prentice Bisbal
> > Lead Software Engineer
> > Research Computing
> > Princeton Plasma Physics Laboratory  http://www.pppl.gov
> >

Re: [Beowulf] [EXTERNAL] Re: Have machine, will compute: ESXi or bare metal?

2020-02-19 Thread Skylar Thompson
Thanks for the pointers, Jim! I had this fear that there was something I
wasn't understanding about zeroconf addressing, so it's good to know that
other people have had similar ideas. :)

With BCCD, we've supported a huge variety of NICs over the years, and have
even had people run clusters over wifi (sometimes even intentionally), though
obviously a shared medium is bad for performance and latency.

On Wed, Feb 12, 2020 at 09:13:30PM +, Lux, Jim (US 337K) wrote:
> I've used zeroconf quite effectively on a couple clusters of beagleboards.  
> If you do it, then you can let the nodes use DHCP to get their IP addresses, 
> which is handy if you're sharing a WiFi network, for instance. 
> 
> On the other hand, for a "training experience", having to go through the 
> process of manually assigning IP addresses, and node identifiers (hostname) 
> and keeping them all straight is a useful thing.  Having done that, you 
> really appreciate zeroconf/bonjour and DHCP.
> 
> My two configurations are:
> Head node is a Macbook running OS X
> Configuration 1: Macbook using Wireless to connect to the "internet" and 
> wired to connect to the cluster, which is 4 BeagleBoard Green with wired 
> ethernet.
> Configuration 2: Macbook using wireless to connect to the "internet" and 
> wired to an 802.11b/g access point, then the cluster is 4 BeagleBoard Green 
> Wireless.
> 
> In both cases, I wind up setting up a bridge from the cluster to the outside 
> world via the macbook, so that I can run things like "apt get" on the nodes 
> to install software.
> I use a combination of screen, ssh, and pdsh to work with the nodes, and scp 
> (with pdsh) to move files around.
> 
> Note that with beagles (and Rpis) you usually use a "network over USB" to get 
> it up originally (the gadget interface)
> 
> On 2/11/20, 7:57 PM, "Beowulf on behalf of Skylar Thompson" 
>  wrote:
> 
> On Tue, Feb 11, 2020 at 06:25:24AM +0800, Benson Muite wrote:
> > 
> > 
> > On Tue, Feb 11, 2020, at 9:31 AM, Skylar Thompson wrote:
> > > On Sun, Feb 09, 2020 at 10:46:05PM -0800, Chris Samuel wrote:
> > > > On 9/2/20 10:36 pm, Benson Muite wrote:
> > > > 
> > > > > Take a look at the bootable cluster CD here:
> > > > > http://www.littlefe.net/
> > > > 
> > > > From what I can see BCCD hasn't been updated for just over 5 years, 
> and the
> > > > last email on their developer list was Feb 2018, so it's likely a 
> little out
> > > > of date now.
> > > > 
> > > > http://bccd.net/downloads
> > > > 
> > > > http://bccd.net/pipermail/bccd-developers/
> > > > 
> > > > On the other hand their TRAC does list some ticket updates a few 
> months ago,
> > > > so perhaps there are things going on but Skylar needs more hands?
> > > > 
> > > > 
> https://cluster.earlham.edu/trac/bccd-ng/report/1?sort=created=0=1
> > > 
> > > Wow, I had no idea people on the Beowulf list were still thinking of 
> BCCD.
> > > :)
> > > 
> > > I've been working on a major BCCD update for a while now (modern 
> Debian,
> > > better node auto-detection) but a combination of life interference 
> and a
> > > shift in focus for the project to curriculum development has slowed me
> > > down.
> > > 
> > > At the end of the day, BCCD has three has three main goals:
> > > 
> > > 1. Non-destructive in its default mode
> > > 2. Simple ("just press enter")
> > > 3. Ready with pedagogically-useful ("validated, verified, and 
> accredited")
> > > curriculum modules
> > > 
> > > One thing I'm hoping can come out of this major update is to decouple 
> the BCCD
> > > from the underlying distribution, since that's been a barrier for some
> > > people in using BCCD. That's just an aspiration right now, but we'll 
> see
> > > where it goes.
> > 
> > Can you give more details on how you expect to decouple it?
> 
> There's a couple things that are tightly integrated with the init process
> now:
> 
> * Network setup - prompts the user for information (which NIC to run on,
>   whether to use DHCP to assign addresses, etc.), detects other BCCD
>   systems on the network, etc.
> 
    > * SSH public key broadcast (pkbcast) - needs to run after a user logs in to
    >   ensure that authorized_keys is setup on other participating systems.

Re: [Beowulf] HPC for community college?

2020-02-19 Thread Skylar Thompson
This was basically our intent with the LittleFe project[1] as well: a lot
of small institutions don't have the facilities or the expertise to run a
supercomputer, but they have students and faculty that would really benefit from
being able to learn the basics of parallel computing and HPC if only there
weren't cost and technical barriers in their way. The need is even present
at the high school level; I know of a handful of LittleFe units that ended
up at high schools through our build-out program. XSEDE is an option,
but especially for younger students having something they can touch and
"own" is a valuable motivator.

One challenge is that it's hard incorporating parallel/HPC into course
materials, but pedagogical resources like CSERD and JOCSE (both Shodor
Foundation projects) can be helpful.

[1] http://littlefe.net/
[2] http://www.shodor.org/refdesk/
[3] http://jocse.org/

On Thu, Feb 20, 2020 at 12:23:22AM +, Chuck Petras wrote:
>You could make it a really personal experience.
> 
> 
>This looks like a fun board to cluster a bunch of Raspberry Pi's
>https://turingpi.com/ or
>    https://www.mininodes.com/product/5-node-raspberry-pi-3-com-carrier-board/
> 
> 
>That way the students could take their projects home with them. In
>class they could cluster multiple of these carriers together.
> 
> 
> 
> 
>Chuck Petras, PE**
> 
>Schweitzer Engineering Laboratories, Inc
> 
>Pullman, WA  99163  USA
> 
>http://www.selinc.com
> 
> 
>SEL Synchrophasors - A New View of the Power System
>
> 
> 
>Making Electric Power Safer, More Reliable, and More Economical (R)
> 
> 
>** Registered in Oregon.



-- 
Skylar


Re: [Beowulf] Have machine, will compute: ESXi or bare metal?

2020-02-11 Thread Skylar Thompson
On Tue, Feb 11, 2020 at 06:25:24AM +0800, Benson Muite wrote:
> 
> 
> On Tue, Feb 11, 2020, at 9:31 AM, Skylar Thompson wrote:
> > On Sun, Feb 09, 2020 at 10:46:05PM -0800, Chris Samuel wrote:
> > > On 9/2/20 10:36 pm, Benson Muite wrote:
> > > 
> > > > Take a look at the bootable cluster CD here:
> > > > http://www.littlefe.net/
> > > 
> > > From what I can see BCCD hasn't been updated for just over 5 years, and 
> > > the
> > > last email on their developer list was Feb 2018, so it's likely a little 
> > > out
> > > of date now.
> > > 
> > > http://bccd.net/downloads
> > > 
> > > http://bccd.net/pipermail/bccd-developers/
> > > 
> > > On the other hand their TRAC does list some ticket updates a few months 
> > > ago,
> > > so perhaps there are things going on but Skylar needs more hands?
> > > 
> > > https://cluster.earlham.edu/trac/bccd-ng/report/1?sort=created=0=1
> > 
> > Wow, I had no idea people on the Beowulf list were still thinking of BCCD.
> > :)
> > 
> > I've been working on a major BCCD update for a while now (modern Debian,
> > better node auto-detection) but a combination of life interference and a
> > shift in focus for the project to curriculum development has slowed me
> > down.
> > 
> > At the end of the day, BCCD has three main goals:
> > 
> > 1. Non-destructive in its default mode
> > 2. Simple ("just press enter")
> > 3. Ready with pedagogically-useful ("validated, verified, and accredited")
> > curriculum modules
> > 
> > One thing I'm hoping can come out of this major update is to decouple the 
> > BCCD
> > from the underlying distribution, since that's been a barrier for some
> > people in using BCCD. That's just an aspiration right now, but we'll see
> > where it goes.
> 
> Can you give more details on how you expect to decouple it?

There's a couple things that are tightly integrated with the init process
now:

* Network setup - prompts the user for information (which NIC to run on,
  whether to use DHCP to assign addresses, etc.), detects other BCCD
  systems on the network, etc.

* SSH public key broadcast (pkbcast) - needs to run after a user logs in to
  ensure that authorized_keys is setup on other participating systems.

The network setup in particular is a challenge in the systemd world, since
getting STDIN from systemd-invoked processes is not trivial. We've also had
some users wanting better integration between networking and desktop
applications, which pushed me to try to make use of the existing
systemd/networkd toolchain rather than rolling our own tooling.

Right now the challenge is the node auto-detection, though I'm hoping that
we might be able to use mDNS and zeroconf-assigned addressing rather than
depending on custom DHCP tags which have been problematic in the past.
zeroconf might also mitigate the biggest problem we have in workshops:
someone jumping the gun and starting up a head node before we're ready to
go.

While I'm not the biggest fan of systemd, it does have the potential to
allow us to get away from custom scripts and use functionality common
across more than one distribution.
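
As a rough idea of where I'd like to end up, something like this networkd
fragment (hypothetical, not what BCCD ships today) would hand link-local
addressing and mDNS off to systemd instead of our init scripts, and node
discovery could then be plain avahi/resolved queries rather than custom
DHCP-tag logic:

   # /etc/systemd/network/20-bccd.network (hypothetical)
   [Match]
   Name=en*

   [Network]
   DHCP=yes
   LinkLocalAddressing=ipv4
   MulticastDNS=yes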

-- 
Skylar


Re: [Beowulf] Have machine, will compute: ESXi or bare metal?

2020-02-10 Thread Skylar Thompson
On Sun, Feb 09, 2020 at 10:46:05PM -0800, Chris Samuel wrote:
> On 9/2/20 10:36 pm, Benson Muite wrote:
> 
> > Take a look at the bootable cluster CD here:
> > http://www.littlefe.net/
> 
> From what I can see BCCD hasn't been updated for just over 5 years, and the
> last email on their developer list was Feb 2018, so it's likely a little out
> of date now.
> 
> http://bccd.net/downloads
> 
> http://bccd.net/pipermail/bccd-developers/
> 
> On the other hand their TRAC does list some ticket updates a few months ago,
> so perhaps there are things going on but Skylar needs more hands?
> 
> https://cluster.earlham.edu/trac/bccd-ng/report/1?sort=created=0=1

Wow, I had no idea people on the Beowulf list were still thinking of BCCD.
:)

I've been working on a major BCCD update for a while now (modern Debian,
better node auto-detection) but a combination of life interference and a
shift in focus for the project to curriculum development has slowed me
down.

At the end of the day, BCCD has three main goals:

1. Non-destructive in its default mode
2. Simple ("just press enter")
3. Ready with pedagogically-useful ("validated, verified, and accredited")
curriculum modules

One thing I'm hoping can come out of this major update is to decouple the BCCD
from the underlying distribution, since that's been a barrier for some
people in using BCCD. That's just an aspiration right now, but we'll see
where it goes.

I'll get off my soapbox and get back to work. :)

-- 
Skylar


Re: [Beowulf] Interactive vs batch, and schedulers [EXT]

2020-01-17 Thread Skylar Thompson
In the Grid Engine world, we've worked around some of the resource
fragmentation issues by assigning static sequence numbers to queue
instances (a node publishing resources to a queue) and then having the
scheduler fill nodes by sequence number rather than spreading jobs across
the cluster. This leaves some nodes free of jobs unless a really big job
comes in that requires entire nodes.
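
Roughly, that boils down to two settings; this is a simplified excerpt (the
queue name, host group, and sequence numbers are made up):

   # scheduler config ("qconf -msconf"): sort queue instances by seq_no, not load
   queue_sort_method    seqno

   # queue config ("qconf -mq batch.q"): plain nodes first, big-memory nodes last
   seq_no               10,[@bigmem_hosts=500]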

Since we're a bioinformatics shop, most of our jobs aren't parallel, though
a few job types require lots of memory (we have a handful of nodes in the
1TB-4TB RAM range). Grid Engine lets us isolate jobs from each other using
cgroups, where a job resource request is translated directly to the
resource (memory, CPU, etc.) limits of a cgroup.

On Fri, Jan 17, 2020 at 08:44:14AM +, Tim Cutts wrote:
>Indeed, and you can quite easily get into a “boulders and sand”
>scheduling problem; if you allow the small interactive jobs (the sand)
>free access to everything, the scheduler tends to find them easy to
>schedule, partially fills nodes with them, and then finds it can’t find
>contiguous resources large enough for the big parallel jobs (the
>boulders), and you end up with the large batch jobs pending forever.
> 
>I’ve tried various approaches to this in the past; for example
>pre-emption of large long running jobs, but that causes resource
>starvation (suspended jobs are still consuming virtual memory) and then
>all sorts of issues with timeouts on TCP connections and so on and so
>forth, these being genomics jobs with lots of not-normal-HPC activities
>like talking to relational databases etc.
> 
>I think you always end up having to ring-fence hardware for the large
>parallel batch jobs, and not allow the interactive stuff on it.
> 
>This of course is what leads some users to favour the cloud, because it
>appears to be infinite, and so the problem appears to go away.  But
>let's not get into that argument here.
> 
>Tim
> 
>On 16 Jan 2020, at 23:50, Alex Chekholko via Beowulf
><[1]beowulf@beowulf.org> wrote:
> 
>Hey Jim,
>There is an inverse relationship between latency and throughput.  Most
>supercomputing centers aim to keep their overall utilization high, so
>the queue always needs to be full of jobs.
>If you can have 1000 nodes always idle and available, then your 1000
>node jobs will usually take 10 seconds.  But your overall utilization
>will be in the low single digit percent or worse.
>Regards,
>Alex
>On Thu, Jan 16, 2020 at 3:25 PM Lux, Jim (US 337K) via Beowulf
><[2]beowulf@beowulf.org> wrote:
> 
>Are there any references out there that discuss the tradeoffs between
>interactive and batch scheduling (perhaps some from the 60s and 70s?) –
> 
>Most big HPC systems have a mix of giant jobs and smaller ones managed
>by some process like PBS or SLURM, with queues of various sized jobs.
> 
> 
>What I’m interested in is the idea of jobs that, if spread across many
>nodes (dozens) can complete in seconds (<1 minute) providing
>essentially “interactive” access, in the context of large jobs taking
>days to complete.   It’s not clear to me that the current schedulers
>can actually do this – rather, they allocate M of N nodes to a
>particular job pulled out of a series of queues, and that job “owns”
>the nodes until it completes.  Smaller jobs get run on (M-1) of the N
>nodes, and presumably complete faster, so it works down through the
>queue quicker, but ultimately, if you have a job that would take, say,
>10 seconds on 1000 nodes, it’s going to take 20 minutes on 10 nodes.
> 
> 
>Jim
> 
> 
> 
>--
> 
> 
> 
> 

Re: [Beowulf] software for activating one of many programs but not the others?

2019-08-20 Thread Skylar Thompson
We also use Environment Modules, with a well-established hierarchy for
software installs
(software-name/software-version/OS/OS-version/architecture). Combined with
some custom Tcl functions and common header files for our module files,
this lets us keep the size of most module files very small (2-5 lines).

If we were to do it again today, maybe we'd use Lmod, but Modules is
functional and has a lot of inertia.
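
For a sense of what that buys us, a typical leaf modulefile ends up being
little more than this (the package, version, and install prefix here are
illustrative, and the real files lean on our shared Tcl helpers):

   #%Module1.0
   set prefix /net/apps/bowtie2/2.4.1/Linux/CentOS7/x86_64
   prepend-path PATH    $prefix/bin
   prepend-path MANPATH $prefix/share/man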

On Tue, Aug 20, 2019 at 06:50:31PM +, Ryan Novosielski wrote:
> Really sounds like you should be using environment modules. What I’d 
> recommend to anyone starting out today would be Lmod: 
> https://lmod.readthedocs.io/en/latest/
> 
> Most of the software building/installation packages interface with it.
> 
> Generally the software installs are done into a place that’s unique for each 
> package and version, and maybe even for what compiler it was built with (see 
> hierarchical).
> 
> --
> 
> || \\UTGERS,   
> |---*O*---
> ||_// the State| Ryan Novosielski - novos...@rutgers.edu
> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> ||  \\of NJ| Office of Advanced Research Computing - MSB C630, 
> Newark
>  `'
> 
> > On Aug 20, 2019, at 1:11 PM, David Mathog  wrote:
> > 
> > On a system I am setting up there are a very large number of different 
> > software packages available.  The sources live in /usr/local/src and a 
> > small number of the most commonly used ones are installed in 
> > /usr/local/bin, /usr/local/lib and so forth.  The issue is that any of the 
> > target end users will only want a couple of these.  If they were all fully 
> > installed into /usr/local there would be some name conflicts.  They may 
> > also be bringing some of their own versions of these, and while $PATH order 
> > can help there, it would be best to avoid those possible conflicts too.  
> > Users don't have priv's to modify /usr/local, so they cannot 
> > install/uninstall there themselves.
> > 
> > So I'm looking for something like
> > 
> >  setup software_name install
> >  setup software_name remove
> > 
> > which would install/uninstall the packages (perhaps by symlinks) from
> > 
> >  /usr/local/src/software_name
> > 
> > under the user's home directory.  The goal is that the setup scripts NOT be 
> > constructed by hand.  It would have a
> > 
> >  setup software_name install
> > 
> > which would emulate a:
> > 
> >  make install
> > 
> > and automatically translate it into the appropriate setup commands.  Some 
> > of these packages have hundreds of programs, so anything manual is going to 
> > be very
> > painful.
> > 
> > Anybody seen a piece of software like this?
> > 
> > I don't expect this to work in all cases.  Some of these packages hard code 
> > paths into the binaries and/or scripts.  The only hope for them is for the 
> > user to do some variant of:
> > 
> >cd $HOMEDIR
> >(cd /usr/local/src; tar -cf - software_name) | tar -xf -
> >cd software_name
> >make clean  #pray that it gets everything!!!
> >./configure --prefix=$HOMEDIR
> >make
> >make install
> > 
> > There is a file which documents how to build each package, although it is 
> > nowhere near complete at this time.
> > 
> > Docker is already available if the user wants to go that route, which 
> > avoids this whole issue, but at the cost of moving big images around.
> > 
> > Thanks,
> > 
> > David Mathog
> > mat...@caltech.edu
> > Manager, Sequence Analysis Facility, Biology Division, Caltech
> 

-- 
Skylar


Re: [Beowulf] GPFS question

2019-04-30 Thread Skylar Thompson
I'm glad to hear you got it working again! Out of curiosity, how many
files and bytes do you have spread across how many NSDs? It's been nearly a
decade since we've had to run mmfsck but the run-time of it is a concern
now that our storage is an order of magnitude bigger than it was then.

Also, if you aren't already, you might subscribe to the GPFS user group
mailing list, which has good GPFS technical discussions and also some
IBM/GPFS developers who post:

http://gpfsug.org/mailman/listinfo/gpfsug-discuss

On Tue, Apr 30, 2019 at 09:38:11PM +0100, Jörg Saßmannshausen wrote:
> Dear all,
> 
> many thanks for all the emails, both on- and offline, which was a great help 
> for 
> me to get some idea how long that might take. 
> 
> All the best from a sunny London!
> 
> Jörg
> 
> Am Dienstag, 30. April 2019, 07:58:20 BST schrieb John Hearns via Beowulf:
> > Hi Jorg. I will mail you offline.
> > IBM support for GPFS is excellent - so if they advise a check like that it
> > is needed.
> > 
> > On Tue, 30 Apr 2019 at 04:53, Chris Samuel  wrote:
> > > On Monday, 29 April 2019 3:47:10 PM PDT Jörg Saßmannshausen wrote:
> > > > thanks for the feedback. I guess it also depends how much meta-data you
> > > 
> > > have
> > > 
> > > > and whether or not you have zillions of small or larger files.
> > > > At least I got an idea how long it might take.
> > > 
> > > This thread might also be useful, it is a number of years old but it does
> > > have some
> > > advice on placement of the filesystem manager before the scan and also on
> > > their
> > > experience scanning a ~1PB filesystem.
> > > 
> > > 
> > > https://www.ibm.com/developerworks/community/forums/html/topic?id=
> > > ----14834266
> > > 
> > > All the best,
> > > Chris
> > > --
> > > 
> > >   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
> > > 
> 

-- 
Skylar


Re: [Beowulf] live free SGE descendent (for Centos 7)?

2019-03-05 Thread Skylar Thompson
Hi David,

Not sure if you saw this, but Univa just announced that they will be
selling support for the open source GE forks:

http://www.univa.com/about/news/press_2019/02282019.php

I don't know how much development time they will include, but as you note,
at some point the open source forks will need updating or will just become
defunct.

Skylar (a happy UGE customer)


On Tue, Mar 5, 2019, 10:39 David Mathog  wrote:

> Are any of the free SGE derived projects still alive?  If so, buildable
> on Centos 7?
>
> Son of grid engine, for instance, has not had a release since 2016
>
> https://arc.liv.ac.uk/downloads/SGE/releases/8.1.9/
>
> and there isn't one for Centos 7.  The last rocks release still has an
> SGE
> option, and that is based on Centos 7, so some version can be built on
> that platform.  Anybody know off hand which one they used?
>
> The Univa version still seems to be kicking, but that is commercial.
>
> I have an old version running on one Centos 7 machine, but it was not
> built there.  It is a 32 binary made long ago (Mandriva 2010 or Mageia
> 3?)  and still uses
>
> /etc/rc.d/init.d/sgemaster
>
> to start/stop rather than a systemd method.
>
> Thanks,
>
> David Mathog
> mat...@caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
>


[Beowulf] HPC sysadmin position at University of Washington Genome Sciences

2019-01-07 Thread Skylar Thompson
Hi Beowulfers,

UW Genome Sciences is hiring for a HPC sysadmin to support our
bioinformatics and general research computing for the department:

https://uwhires.admin.washington.edu/eng/candidates/default.cfm?szCategory=jobprofile=163509=0=linux=1

We're a small group (three engineers and a manager) who support ~650
cluster nodes, ~100 servers, 9PB of disk storage (standardizing on IBM
GPFS/DDN, from Isilon), and 16PB of tape storage (IBM TSM/Oracle STK).

Definitely feel free to email me off-list if you have questions.

-- 
Skylar


Re: [Beowulf] Poll - Directory implementation

2018-10-27 Thread Skylar Thompson
Yup, indeed, nslcd is slow enough on its own that nscd helps despite its
flaws...

At this point, the only reason we haven't made the switch to sssd is that
nslcd works just well enough not to become a project to replace it.

On Sat, Oct 27, 2018 at 08:08:32AM +0100, John Hearns via Beowulf wrote:
>To be clear I am talking about the Name Service Cacheing Daemon
>I have always found this to be more trouble than it is worth  - it
>holds on to out of date information,
>and needs to be restarted when you are debugging things like batch
>systems etc.
>nslcd is something completely different (*)   and whoever chose similar
>names should be forced to watch endless re-runs of the Parrot Sketch.
>[1]https://wiki.samba.org/index.php/Nslcd
>(*) obligatory Python reference
> 
>    On Sat, 27 Oct 2018 at 04:12, Skylar Thompson
><[2]skylar.thomp...@gmail.com> wrote:
> 
>  Good to know - we're still nslcd users so have yet to run into that,
>  though
>  are about to make the leap to CentOS 7 where I think we will have to
>  use
>  it.
>  On Sat, Oct 27, 2018 at 03:13:47AM +0100, John Hearns via Beowulf
>  wrote:
>  >Skylar, I believe that nscd does not work well with sssd and I disabled it.
>  >See https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/usingnscd-sssd
>  >I believe that nscd is the work of Auld Nick himself and causes more
>  >problems than it is worth on HPC nodes.
>  >If you want to speed up cacheing with sssd itself you can put its local
>  >caches on a RAMdisk. This has the cost of no persistence of course and
>  >uses up RAM which you may prefer to put to better use.
>  >
>  >On Sat, 27 Oct 2018 at 00:59, Skylar Thompson
>  ><[2][4]skylar.thomp...@gmail.com> wrote:
>  >
>  >  On Fri, Oct 26, 2018 at 08:44:28PM +, Ryan Novosielski
>  wrote:
>  >  > Our LDAP is very small, compared to the sorts of things
>  some
>  >  people run.
>  >  >
>  >  > We added indexes today on uid, uidNumber, and gidNumber and
>  the
>  >  problem went away. Didn’t try it earlier as it had virtually
>  no
>  >  impact on our testing system for whatever reason, but on a
>  different
>  >  testing system and on production, it dropped “ls -al /home/“
>  from
>  >  ~90s to ~5s. I’m not sure if all three were necessary, but
>  I’ll look
>  >  back at that later.
>  >  >
>  >  > We’ve run SSSD from day one, so that eliminates the nscld
>  >  question. We also moved CentOS 5.x to SSSD, FYI (I believe
>  there was
>  >  someone else with some old systems around). Was pretty
>  painless, and
>  >  SSSD eliminates a lot of problems that exist with the older
>  stuff
>  >  (including some really boneheaded very large LDAP queries
>  that were
>  >  happening routinely with the older nss-ldap software if I’m
>  >  remembering its name correctly).
>  >  Have you experimented with client-side caching services like
>  nscd?
>  >  nscd has
>  >  its quirks (in particular, it does very poorly with caching
>  spurious
>  >  negative
>  >  results from transient network failures), but it also is a
>  big
>  >  performance
>  >  improvement since you don't even have to hit the network or
>  the
>  >  directory
>  >  services.
>  >  --
>  >  Skylar
>  >  ___
>  >  Beowulf mailing list, [3][5]Beowulf@beowulf.org sponsored by
>  Penguin
>  >  Computing
>  >  To change your subscription (digest mode or unsubscribe)
>  visit
>  >  [4][6]http://www.beowulf.org/mailman/listinfo/beowulf
>  >
>  > References
>  >
>  >1.
>  [7]https://access.redhat.com/documentation/en-us/red_hat_enterprise_
>  linux/6/html/deployment_guide/usingnscd-sssd
>  >2. mailto:[8]skylar.thomp...@gmail.com
>  >3. mailto:[9]Beowulf@beowulf.org
>  >4. [10]http://www.beowulf.org/mailman/listinfo/beowulf
>  > ___
>  > Beowulf mailing list, [11]Beowulf@beowulf.org sponsored by Penguin Computing

Re: [Beowulf] Poll - Directory implementation

2018-10-26 Thread Skylar Thompson
Good to know - we're still nslcd users so have yet to run into that, though
are about to make the leap to CentOS 7 where I think we will have to use
it.

On Sat, Oct 27, 2018 at 03:13:47AM +0100, John Hearns via Beowulf wrote:
>Skylar, I believe that nscd does not work well with sssd and I disabled
>it.
>See [1]https://access.redhat.com/documentation/en-us/red_hat_enterprise
>_linux/6/html/deployment_guide/usingnscd-sssd
>I believe that nscd is the work of Auld Nick himself and causes more
>problems than it is worth on HPC nodes.
>If you want to speed up cacheing with sssd itself you can put its local
>caches on a RAMdisk. This has the cost of no persistence of course and
>uses up RAM which you may prefer to put to better use.
> 
>    On Sat, 27 Oct 2018 at 00:59, Skylar Thompson
><[2]skylar.thomp...@gmail.com> wrote:
> 
>  On Fri, Oct 26, 2018 at 08:44:28PM +, Ryan Novosielski wrote:
>  > Our LDAP is very small, compared to the sorts of things some
>  people run.
>  >
>  > We added indexes today on uid, uidNumber, and gidNumber and the
>  problem went away. Didn’t try it earlier as it had virtually no
>  impact on our testing system for whatever reason, but on a different
>  testing system and on production, it dropped “ls -al /home/“ from
>  ~90s to ~5s. I’m not sure if all three were necessary, but I’ll look
>  back at that later.
>  >
>  > We’ve run SSSD from day one, so that eliminates the nscld
>  question. We also moved CentOS 5.x to SSSD, FYI (I believe there was
>  someone else with some old systems around). Was pretty painless, and
>  SSSD eliminates a lot of problems that exist with the older stuff
>  (including some really boneheaded very large LDAP queries that were
>  happening routinely with the older nss-ldap software if I’m
>  remembering its name correctly).
>  Have you experimented with client-side caching services like nscd?
>  nscd has
>  its quirks (in particular, it does very poorly with caching spurious
>  negative
>  results from transient network failures), but it also is a big
>  performance
>  improvement since you don't even have to hit the network or the
>  directory
>  services.
>  --
>  Skylar
>  ___
>  Beowulf mailing list, [3]Beowulf@beowulf.org sponsored by Penguin
>  Computing
>  To change your subscription (digest mode or unsubscribe) visit
>  [4]http://www.beowulf.org/mailman/listinfo/beowulf
> 
> References
> 
>1. 
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/usingnscd-sssd
>2. mailto:skylar.thomp...@gmail.com
>3. mailto:Beowulf@beowulf.org
>4. http://www.beowulf.org/mailman/listinfo/beowulf

> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf


-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Poll - Directory implementation

2018-10-26 Thread Skylar Thompson
On Fri, Oct 26, 2018 at 08:44:28PM +, Ryan Novosielski wrote:
> Our LDAP is very small, compared to the sorts of things some people run.
> 
> We added indexes today on uid, uidNumber, and gidNumber and the problem went 
> away. Didn’t try it earlier as it had virtually no impact on our testing 
> system for whatever reason, but on a different testing system and on 
> production, it dropped “ls -al /home/“ from ~90s to ~5s. I’m not sure if all 
> three were necessary, but I’ll look back at that later.
> 
> We’ve run SSSD from day one, so that eliminates the nscld question. We also 
> moved CentOS 5.x to SSSD, FYI (I believe there was someone else with some old 
> systems around). Was pretty painless, and SSSD eliminates a lot of problems 
> that exist with the older stuff (including some really boneheaded very large 
> LDAP queries that were happening routinely with the older nss-ldap software 
> if I’m remembering its name correctly).

Have you experimented with client-side caching services like nscd? nscd has
its quirks (in particular, it does very poorly with caching spurious negative
results from transient network failures), but it also is a big performance
improvement since you don't even have to hit the network or the directory
services.
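
The negative-caching quirk can at least be blunted in nscd.conf -- the
values here are just a starting point, not what we actually run:

  positive-time-to-live    passwd    600
  negative-time-to-live    passwd    5
  positive-time-to-live    group     3600
  negative-time-to-live    group     5

so a lookup that failed during a network blip ages out in seconds while
good entries stick around.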

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Poll - Directory implementation

2018-10-25 Thread Skylar Thompson
At Univ. of WA Genome Sciences, we use Active Directory, but we also
support a modest desktop environment. As much as I am not a fan of
Microsoft, AD just works (even the replication) and, since someone else is
responsible for the Windows gear here, I can just think of it as a
LDAP/Krb5 store with a few minor extensions.

On Wed, Oct 24, 2018 at 11:29:39AM -0500, Tom Harvill wrote:
> 
> Hello,
> 
> Long time lurker, very infrequent poster - I enjoy this list very much.
> 
> We run multiple clusters in different data centers with a single directory
> (LDAP) for general authentication and some user grouping for special
> purposes (eg delineating admin users for privileges). We put 'extra' user
> data in an RDBMS.
> 
> We currently use 389-DS (aka Fedora Directory Server) and there is some
> internal pressure to switch to OpenLDAP.
> 
> 389-DS is working well, we use the multi-master feature.  It really hasn't
> failed us.
> 
> I'm writing this list to ask:
> 
> - what directory solution do you implement?
> - if LDAP, which flavor?
> - do you have any opinions one way or another on the topic?
> 
> Because 389-DS has just worked, it's sort-of out of sight and mind. I've
> been re-engaging it for a little while and from what I can see it's fairly
> well documented (I don't remember this being the case when we originally set
> it up 10+ years ago.)  I think OpenLDAP doesn't have integrated multi-master
> replication - that feature appears to be a bolted on script.
> 
> Thanks in advance for your time,
> 
> Tom
> 
> Tom Harvill
> Holland Computing Center
> https://hcc.unl.edu
> 
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Contents of Compute Nodes Images vs. Login Node Images

2018-10-23 Thread Skylar Thompson
At Univ. of WA Genome Sciences, we run the same build on both login and
compute nodes. The login nodes are obviously not as capable as our compute
nodes, but it's easier for us to provision them in the same way.

On Tue, Oct 23, 2018 at 04:15:51PM +, Ryan Novosielski wrote:
> Hi there,
> 
> I realize this may not apply to all cluster setups, but I’m curious what 
> other sites do with regard to software (specifically distribution packages, 
> not a shared software tree that might be remote mounted) for their login 
> nodes vs. their compute nodes. From what I knew/conventional wisdom, sites 
> generally place pared down node images on compute nodes, only containing the 
> runtime. I’m curious to see if that’s still true, or if there are people 
> doing something else entirely, etc.
> 
> Thanks.
> 
> --
> 
> || \\UTGERS,   
> |---*O*---
> ||_// the State| Ryan Novosielski - novos...@rutgers.edu
> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> ||  \\of NJ| Office of Advanced Research Computing - MSB C630, 
> Newark
>  `'
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] memory usage

2018-06-22 Thread Skylar Thompson
On Friday, June 22, 2018, Michael Di Domenico 
wrote:

> On Fri, Jun 22, 2018 at 2:28 PM, Skylar Thompson
>  wrote:
> > Assuming Linux, you can get that information out of /proc//smaps and
> > numa_maps.
>
> the memory regions are in there for the used bits, but i don't have
> anything that translates those regions to which cpu the region sits on
>

I think the number before the = in the page count fields (Nn=pages) is the
NUMA node number. Not exactly the CPU, but the memory isn't allocated to a
specific CPU (modulo CPU cache).

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] memory usage

2018-06-22 Thread Skylar Thompson
Assuming Linux, you can get that information out of /proc//smaps and
numa_maps.

Skylar

On Friday, June 22, 2018, Michael Di Domenico 
wrote:

> does anyone know of a tool that looks at a process
> (single/multi-threaded) and tells you how much memory it's using and
> in which numa domain the allocated memory is sitting.
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Skylar Thompson
On Tue, Jun 12, 2018 at 11:08:44AM -0400, Prentice Bisbal wrote:
> On 06/12/2018 12:33 AM, Chris Samuel wrote:
> 
> >Hi Prentice!
> >
> >On Tuesday, 12 June 2018 4:11:55 AM AEST Prentice Bisbal wrote:
> >
> >>I to make this work, I will be using job_submit.lua to apply this logic
> >>and assign a job to a partition. If a user requests a specific partition
> >>not in line with these specifications, job_submit.lua will reassign the
> >>job to the appropriate QOS.
> >Yeah, that's very much like what we do for GPU jobs (redirect them to the
> >partition with access to all cores, and ensure non-GPU jobs go to the
> >partition with fewer cores) via the submit filter at present..
> >
> >I've already coded up something similar in Lua for our submit filter (that 
> >only
> >affects my jobs for testing purposes) but I still need to handle memory
> >correctly, in other words only pack jobs when the per-task memory request *
> >tasks per node < node RAM (for now we'll let jobs where that's not the case 
> >go
> >through to the keeper for Slurm to handle as now).
> >
> >However, I do think Scott's approach is potentially very useful, by directing
> >jobs < full node to one end of a list of nodes and jobs that want full nodes
> >to the other end of the list (especially if you use the partition idea to
> >ensure that not all nodes are accessible to small jobs).
> >
> This was something that was very easy to do with SGE. It's been a while
> since I worked with SGE so I forget all the details, but in essence, you
> could assign nodes a 'serial number' which would specify the preferred order
> in which nodes would be assigned to jobs, and I believe that order was
> specific to each queue, so if you had 64 nodes, one queue could assign jobs
> starting at node 1 and work it's way up to node 64, while another queue
> could start at node 64 and work its way down to node 1. This technique was
> mentioned in the SGE documentation to allow MPI and shared memory jobs to
> share the cluster.
> 
> At the time, I used it, for exactly that purpose, but I didn't think it was
> that big a deal. Now that I don't have that capability, I miss it.

Yep, this is still the case. It's not actually a setting of the exec host,
but of each queue instance that the exec host is providing. By default GE
sorts queue instances by load but you can set sequence number in the
scheduler configuration. Unfortunately, this is a cluster-wide setting, so
you can't have some queues sorted by load and others sorted by sequence
number.
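
Concretely (the queue and host names below are invented), it's a scheduler
setting plus per-queue seq_no overrides:

  # qconf -msconf
  queue_sort_method    seqno

  # qconf -mq serial.q
  seq_no    10,[node64=1],[node63=2]

  # qconf -mq mpi.q
  seq_no    10,[node01=1],[node02=2]

so one queue fills from one end of the node list and the other from the
opposite end -- but, as above, the seqno sort itself applies cluster-wide.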

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Skylar Thompson
On Tue, Jun 12, 2018 at 10:06:06AM +0200, John Hearns via Beowulf wrote:
> What do most sites do for scratch space?

We give users access to local disk space on nodes (spinning disk for older
nodes, SSD for newer nodes), which (for the most part) GE will address with
the $TMPDIR job environment variable. We have a "ssd" boolean complex that
users can place in their job to request SSD nodes if they know they will
benefit from them.
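
A job that wants local SSD scratch ends up looking something like this (the
complex name matches ours; the program and file names are placeholders):

  #!/bin/sh
  #$ -l ssd=1
  cp big_input.dat $TMPDIR/
  my_analysis $TMPDIR/big_input.dat > $TMPDIR/results.out
  cp $TMPDIR/results.out $SGE_O_WORKDIR/

GE creates and removes $TMPDIR per job, so anything worth keeping has to be
copied back out before the job exits.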

We also have labs that use non-backed up portions of their network storage
(Isilon for the older storage, DDN/GPFS for the newer) for scratch space
for processing of pipeline data, where different stages of the pipeline run
on different nodes.

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-12 Thread Skylar Thompson
On Tue, Jun 12, 2018 at 02:28:25PM +1000, Chris Samuel wrote:
> On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote:
> 
> > Unfortunately we don't have a mechanism to limit
> > network usage or local scratch usage
> 
> Our trick in Slurm is to use the slurmdprolog script to set an XFS project
> quota for that job ID on the per-job directory (created by a plugin which
> also makes subdirectories there that it maps to /tmp and /var/tmp for the
> job) on the XFS partition used for local scratch on the node.
> 
> If they don't request an amount via the --tmp= option then they get a default
> of 100MB.  Snipping the relevant segments out of our prolog...
> 
> JOBSCRATCH=/jobfs/local/slurm/${SLURM_JOB_ID}.${SLURM_RESTART_COUNT}
> 
> if [ -d ${JOBSCRATCH} ]; then
> QUOTA=$(/apps/slurm/latest/bin/scontrol show JobId=${SLURM_JOB_ID} | 
> egrep MinTmpDiskNode=[0-9] | awk -F= '{print $NF}')
> if [ "${QUOTA}" == "0" ]; then
> QUOTA=100M
> fi
> /usr/sbin/xfs_quota -x -c "project -s -p ${JOBSCRATCH} 
> ${SLURM_JOB_ID}" /jobfs/local
> /usr/sbin/xfs_quota -x -c "limit -p bhard=${QUOTA} ${SLURM_JOB_ID}" 
> /jobfs/local

Thanks, Chris! We've been considering doing this with GE prolog/epilog
scripts (and boot-time logic to clean up if a node dies with scratch space
still allocated) but haven't gotten around to it. I think we might also
need to get buy-in from some groups that are happy with the unenforced
state right now.

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-11 Thread Skylar Thompson
On Mon, Jun 11, 2018 at 02:36:14PM +0200, John Hearns via Beowulf wrote:
> Skylar Thomson wrote:
> >Unfortunately we don't have a mechanism to limit
> >network usage or local scratch usage, but the former is becoming less of a
> >problem with faster edge networking, and we have an opt-in bookkeeping
> mechanism
> >for the latter that isn't enforced but works well enough to keep people
> happy.
> That is interesting to me. At ASML I worked on setting up Quality of
> Service, ie bandwidth limits, for GPFS storage and MPI traffic.
> GPFS does have QoS limits inbuilt, but these are intended to limit the
> backgrouns housekeeping tasks rather than to limit user processes.
> But it does have the concept.
> With MPI you can configure different QoS levels for different traffic.
> 
> More relevently I did have a close discussion with Parav Pandit who is
> working on the network QoS stuff.
> I am sure there is something more up to date than this
> https://www.openfabrics.org/images/eventpresos/2016presentations/115rdmacont.pdf
> Sadly this RDMA stuff needs a recent 4-series kernel. I guess the
> discussion on whether or not you should go with a bleeding edge kernel is
> for another time!
> But yes cgroups have configurable network limits with the latest kernels.
> 
> Also being cheeky, and I probably have mentioned them before, here is a
> plug for Ellexus https://www.ellexus.com/
> Worth mentioning I have no connection with them!

Thanks for the pointer to Ellexus - their I/O profiling does look like
something that could be useful for us. Since we're a bioinformatics shop
and mostly storage-bound rather than network-bound, we haven't really
needed to worry about node network limitations (though occasionally have
had to worry about ToR or chassis switch limitations), but have really
suffered at times when people assume that disk performance is limitless,
and random access is the same as sequential access.

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-10 Thread Skylar Thompson
On Sun, Jun 10, 2018 at 06:46:04PM +1000, Chris Samuel wrote:
> On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote:
> 
> > We're a Grid Engine shop, and we have the execd/shepherds place each job in
> > its own cgroup with CPU and memory limits in place.
> 
> Slurm has supports cgroups as well (and we use it extensively), the idea here 
> is more to try and avoid/minimise unnecessary inter-node MPI traffic.

We have very little MPI, but if I had to solve this in GE, I would try to
fill up one node before sending jobs to another. The queue sort order
(defaults to instance load, but can be set to a simple sequence number) is
a general way, while the allocation rule for parallel environments
(defaults to round_robin, but can be set to fill_up) is another specific to
multi-slot jobs.
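
The parallel environment side of that is a one-line change (the PE name is
just an example):

  # qconf -mp mpi.pe
  allocation_rule    $fill_up

which makes GE pack slots onto as few hosts as possible instead of
spreading them round-robin.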

Not sure the specifics for Slurm, though.

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

2018-06-09 Thread Skylar Thompson
We're a Grid Engine shop, and we have the execd/shepherds place each job in
its own cgroup with CPU and memory limits in place. This lets our users
make efficient use of our HPC resources whether they're running single-slot
jobs, or multi-node jobs. Unfortunately we don't have a mechanism to limit
network usage or local scratch usage, but the former is becoming less of a
problem with faster edge networking, and we have an opt-in bookkeeping 
mechanism 
for the latter that isn't enforced but works well enough to keep people
happy.

On Fri, Jun 08, 2018 at 05:21:56PM +1000, Chris Samuel wrote:
> Hi all,
> 
> I'm curious to know what/how/where/if sites do to try and reduce the impact 
> of 
> fragmentation of resources by small/narrow jobs on systems where you also 
> have 
> to cope with large/wide parallel jobs?
> 
> For my purposes a small/narrow job is anything that will fit on one node 
> (whether a single core job, multi-threaded or MPI).
> 
> One thing we're considering is to use overlapping partitions in Slurm to have 
> a subset of nodes that are available to these types of jobs and then have 
> large parallel jobs use a partition that can access any node.
> 
> This has the added benefit of letting us set a higher priority on that 
> partition to let Slurm try and place those jobs first, before smaller ones.
> 
> We're already using a similar scheme for GPU jobs where they get put into a 
> partition that can access all 36 cores on a node whereas non-GPU jobs get put 
> into a partition that can only access 32 cores on a node, so effectively we 
> reserve 4 cores a node for GPU jobs.
> 
> But really I'm curious to know what people do about this, or do you not worry 
> about it at all and just let the scheduler do its best?
> 
> All the best,
> Chris
> -- 
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] OT, X11 editor which works well for very remote systems?

2018-06-07 Thread Skylar Thompson
vim can actually edit remote files over SCP transparently, using URL-style file paths:

http://vim.wikia.com/wiki/Editing_remote_files_via_scp_in_vim
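
For example (the user and path here are placeholders), opening a file
through vim's netrw handler looks like:

  vim scp://user@remotehost//home/user/script.sh

Note the double slash for an absolute path on the remote side; a single
slash is relative to the remote home directory. Saving writes the file back
over scp.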

On Thu, Jun 07, 2018 at 07:22:57AM +0800, Deng Xiaodong wrote:
> In this case I would think SSH + nano/vi may be better choice as the data
> transited is less.
> 
> Another way to bypass may be to use scp to copy files between your remote
> and local machines then edit locally?
> 
> 
> On Thu, Jun 7, 2018 at 05:28 David Mathog  wrote:
> 
> > Off Topic.
> >
> > I need to do some work on a system 3000 miles away.  No problem
> > connecting to it with ssh or setting X11 forwarding, but the delays are
> > such that my usual editor (nedit) spends far too much time redrawing to
> > be useful.  Resizing a screen is particularly painful.
> >
> > Are there any X11 GUI editors that are less sensitive to these issues?
> >
> > If not I will just use nano or vim.
> >
> > Thanks,
> >
> > David Mathog
> > mat...@caltech.edu
> > Manager, Sequence Analysis Facility, Biology Division, Caltech
> > ___
> > Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf
> >

> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf


-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] batch systems connection

2018-05-29 Thread Skylar Thompson
For software for which there are no pre-built RPMs, I'm a big fan of using
fpm to build the RPMs, rather than trying to write a .spec file by hand:

https://github.com/jordansissel/fpm

You can point fpm at a directory tree and build an RPM (or .deb, or Solaris
pkg, etc.) from it. There are options for pre-/post-install scripts,
dependencies, requirements, etc. as well.
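
As a rough sketch (the package name, version, and paths below are just
placeholders), turning an installed tree into an RPM looks something like:

  fpm -s dir -t rpm -n cream -v 1.0.0 \
      --depends glibc \
      --after-install postinstall.sh \
      -C /opt/cream --prefix /opt/cream .

fpm generates the spec-file boilerplate for you, and the same command with
-t deb produces a Debian package instead.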

On Mon, May 28, 2018 at 02:45:54PM +0300, Mikhail Kuzminsky wrote:
> Sorry, may be my question is not exactly for our Beowulf maillist.
> I have 2 small OpenSuSE-based clusters using different batch systems,
> and want to connect them "grid-like", via CREAM (Computing Resource
> Execution And Management)
> service (I may add
> also one additional common server for both clusters).
> 
> But there is no CREAM binary RPMs for OpenSuSE (Only for CentOS7/SL6
> on UMD site
> //repository.egi.eu/2018/03/14/release-umd-4-6-1/). I did not find:
> where I can download source text of CREAM software ?
> 
> Mikhail Kuzminsky
> Zelinsky Institute of Organic Chemistry
> Moscow
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Pdsh output to multiple windows

2018-04-14 Thread Skylar Thompson
I'm not sure how to do this with pdsh, but I know Ansible can capture
output per task. That doesn't get you output per window by itself, but at the
point the output is split up per host you could write it to a named pipe per
host and then read each pipe in whatever window you like.

Skylar

On Sat, Apr 14, 2018 at 11:22 AM, Lux, Jim (337K) 
wrote:

> Pdsh does most of what I want, but there’s one thing that maybe it does
> (and I’ve not found it), or there’s some clever way to do this..
>
>
>
> What I would like is to have the console output from the N ssh sessions
> return to separate windows..
>
>
>
> Pdsh does a nice job of bringing it all back to one place, but it would be
> nice to have it divided up.. Say you’re doing pdsh  to fire off a sequence
> of apt-get, or ls or something like that which returns multiple lines.. –
> since each node runs at a different rate, the lines come back randomly
> interspersed.
>
>
>
> $pdsh -w b0[1-4] “ls -al /tmp/*”
>
> Or
>
> $pdsh -w b0[1-4] “echo temppwd | sudo -S apt-get update package”
>
>
>
>
>
> I happen to be on a Mac, so there are apps like iTerm2 that apparently do
> this, but I’d like a bit more generic (any *nix)
>
>
>
> I see ClusterSSH in a google search, but haven’t tried it yet.  Apparently
> there’s a flavor called csshX for Mac OS X
>
>
>
>
>
> Any ideas?
>
>
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Puzzling Intel mpi behavior with slurm

2018-04-05 Thread Skylar Thompson
At least for Grid Engine/OpenMPI the preferred mechanism ("tight
integration") involves the shepherds running on each exec hosts to start
MPI, without any SSH/RSH required at all. I'm not sure if you've run across
this documentation, but it might help to figure out what's going on:

https://slurm.schedmd.com/mpi_guide.html#intel_mpi

I'm guessing you're using the "srun" method right now.
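
From that guide, the PMI-based launch (which avoids ssh entirely) looks
roughly like this in a batch script -- the libpmi.so path is an assumption
and depends on where Slurm is installed:

  #!/bin/bash
  #SBATCH --ntasks=64
  export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
  srun ./mpi_app

With that, slurmd starts the MPI ranks directly on each node, which is why
no passwordless ssh is needed.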

Skylar

On Thu, Apr 5, 2018 at 8:10 AM, Faraz Hussain  wrote:

> Here's something quite baffling. I have a cluster running slurm but have
> not setup passwordless ssh for a user yet. So when the user runs "mpirun -n
> 2 -hostfile hosts hostname", it will hang because of ssh issue. That is
> expected.
>
> Now the baffling thing is the mpirun command works inside a slurm script!
> How can it work if passwordless ssh has not been configured? Does slurm use
> some different authentication (munge?) to login to the hosts and execute
> the hostname command?
>
> Or does slurm have some fancy behind the scenes integration with Intel mpi
> ?
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Slow RAID reads, no errors logged, why?

2018-03-19 Thread Skylar Thompson
Could it be a patrol read, possibly hitting a marginal disk? We've run into
this on some of our Dell systems, and exporting the RAID HBA logs reveals
what's going on. You can see those with "omconfig storage controller
controller=n action=exportlog" (exports logs in /var/log/lsi_mmdd.log) or
an equivalent MegaCLI command that I can't remember right now. We had a
rash of these problems, along with uncaught media errors (probably a
combination disk/firmware bug), so we ended up sending these logs to
Splunk, but if it's a one-off thing it's pretty easy to spot visually too.
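
If memory serves (worth double-checking against your MegaCLI version), the
rough equivalents are:

  MegaCli64 -AdpEventLog -GetEvents -f events.log -aALL
  MegaCli64 -FwTermLog -Dsply -aALL

The first dumps the controller event log to a file and the second prints
the firmware terminal log; patrol read activity and media errors show up
in both.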

Skylar

On Mon, Mar 19, 2018 at 1:58 PM, David Mathog  wrote:

> On one of our Centos 6.9 systems with a PERC H370 controller I just noticed
> that file system reads are quite slow.  Like 30Mb/s slow.  Anybody care to
> hazard a guess what might be causing this situation?  We have another quite
> similar machine which is fast (A), compared to this (B) which is slow:
>A  B
> RAM512512 GB
> CPUs   48 56  (via /proc/cpuinfo, actually this is threads)
> AdapterH710P  H730
> RAID Level *  *   Primary-5, Secondary-0, RAID Level Qualifier-3
> Size   7.275  9.093   TB
> state  *  *   Optimal
> Drives 5  6
> read rate  54030 Mb/s (dd if=largefile bs=8192 of=/dev/null& ;
> iotop)
> sata disk   ST2000NM0033
> sas disk  ST2000NM0023
> patrol NoNo   (megacli shows patrol read not going now)
>
> ulimit -a on both is:
> core file size  (blocks, -c) 0
> data seg size   (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size   (blocks, -f) unlimited
> pending signals (-i) 2067196
> max locked memory   (kbytes, -l) 64
> max memory size (kbytes, -m) unlimited
> open files  (-n) 6
> pipe size(512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority  (-r) 0
> stack size  (kbytes, -s) 10240
> cpu time   (seconds, -t) unlimited
> max user processes  (-u) 4096
> virtual memory  (kbytes, -v) unlimited
> file locks  (-x) unlimited
>
> Nothing in the SMART values indicating a read problem, although on "B"
> one disk is slowly accumulating events in the write x rereads/rewrites
> measurement (it has 2346, accumulated at about 10 per week).  The value is
> 0 there for reads x rereads/rewrites.  For "B" the smartctl output columns
> are:
>
>  Errors Corrected by Total   Correction GigabytesTotal
>ECCrereads/  errorsalgorithm  processed
>  uncorrected
>fast | delayed rewrites corrected invocations   [10^9 bytes]  errors
>
> read: 934353848  0 0 934353848  0 48544.026 0
> read: 2017672022 0 0 2017672022 0 48574.489 0
> read: 2605398517 3 0 2605398520 3 48516.951 0
> read: 3237457411 1 0 3237457412 1 48501.302 0
> read: 2028103953 0 0 2028103953 0 14438.132 0
> read: 197018276  0 0 197018276  0 48640.023 0
>
> write: 0 0 0 0 0 26394.472 0
> write: 0 0 2346 2346 2346 26541.534 0
> write: 0 0 0 0 0 27549.205 0
> write: 0 0 0 0 0 25779.557 0
> write: 0 0 0 0 0 11266.293 0
> write: 0 0 0 0 0 26465.227 0
>
> verify: 341863005  0 0 341863005  0 241374.368 0
> verify: 866033815  0 0 866033815  0 223849.660 0
> verify: 2925377128 0 0 2925377128 0 221697.809 0
> verify: 1911833396 6 0 1911833402 6 228054.383 0
> verify: 192670736  0 0 192670736  0 66322.573 0
> verify: 1181681503 0 0 1181681503 0 222556.693 0
>
> If the process doing the IO is root it doesn't go any faster.
>
> Oddly if on "B" a second dd process is started on another file it ALSO
> reads at 30Mb/s.  So the disk system then does a total of 60Gb/s, but only
> 30Gb/s per process.  Added a 3rd and a 4th process doing the same.  At the
> 4th it seemed to hit some sort of limit, with each process now consistently
> less than 30Gb/s and the total at maybe 80Gb/s total.  Hard to say what the
> exact total was as it was jumping around like crazy.  On "A" 2 processes
> each got 270Mb/s,
> and 3 180Mb/s.  Didn't try 4.
>
> The only oddness of late on "B" is that a few days ago it loaded too many
> memory hungry processes so the OS killed some.  I have had that happen
> before on other systems without them doing anything odd afterwards.
>
> Any ideas what this slowdown might be?
>
> Thanks,
>
> David Mathog
> mat...@caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf

Re: [Beowulf] Update on dealing with Spectre and Meltdown

2018-03-08 Thread Skylar Thompson
We installed the kernel updates when they became available. Fortunately we
were a little slower on the firmware updates, and managed to roll back the
few we did apply that introduced instability. We're a bioinformatics shop
(data parallel, lots of disk I/O mostly to GPFS, few-to-no
cross-communication between nodes), and actually had some jobs start
running faster, though the group running them came back to us to report
that they had taken advantage of the maintenance window to make some tweaks
to their pipeline.

That's sort of a long way of saying YMMV.
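
For anyone auditing where they ended up, newer kernels expose the current
state directly in sysfs:

  grep . /sys/devices/system/cpu/vulnerabilities/*

which shows per-vulnerability whether a mitigation is active, and helps
separate the kernel side from the firmware/microcode side.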

Skylar

On Thu, Mar 8, 2018 at 10:10 AM, Prentice Bisbal  wrote:

> Beowulfers,
>
> Have any of you updated the kernels on your clusters to fix the Spectre
> and Meltdown vulnerabilities? I was following this issue closely for the
> first couple of weeks. There seemed to be a lack of consensus on how much
> these fixed would impact HPC jobs, and if I recall correctly, some of the
> patches really hurt performance, or caused other problems. We took a
> wait-and-see approach here. So now that I've waited a while, what did you
> see?
>
> --
> Prentice
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Storage Best Practices

2018-02-19 Thread Skylar Thompson
For our larger groups, we'll meet with them regularly to discuss their
space usage (and other IT needs). Even that's unlikely to be frequent
enough, so we direct usage alerts to their designated "data manager" if
they're getting close to running out of space. Aside from regularly
clearing out scratch space, though, we try to stay out of deletion
decisions, to avoid problems stemming from miscommunication between the
group and IT, or even within the group (left hand doesn't know what the
right hand is doing).

Skylar

On Mon, Feb 19, 2018 at 6:12 AM, Richter, Brian J {BIS} <
brian.j.rich...@pepsico.com> wrote:

> Hey All,
>
>
>
> I was hoping to get some recommendations for Storage. Last year we set up
> our first HPC and I’m looking for a good strategy moving forward for
> Storage. We set up a dedicated space on the cluster for Storage that has
> 5.5 TB of space. This space can be quickly chewed up depending on the
> project the business is working on. Do you guys typically do a retention
> policy on the cluster to ensure there is always enough space? I was
> thinking anything older than a month should be cleaned up.
>
>
>
> Thanks!
>
>
>
> *Brian J. Richter*
>
> Global R&D Senior Analyst • Information Technology
>
> 617 W Main St, Barrington, IL 60010
> Office: 847-304-2356 • Mobile: 847-305-6306
>
> brian.j.rich...@pepsico.com
>
>
>
>
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Cluster Authentication (LDAP,NIS,AD)

2017-12-29 Thread Skylar Thompson
It's a mechanism for having the automounter process run an executable as
part of the mount process. The executable takes the map key as its
sole argument (e.g. /net/foo/bar would pass bar as the key) and then
prints the mount parameters on STDOUT. We use a Python script
with a YAML configuration file (easy to edit and validate) but it can be
any executable type.
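
A minimal shell sketch of one (the server and export names are invented)
is just a script that echoes a map entry for whatever key it's handed:

  #!/bin/sh
  # called by autofs with the lookup key as $1
  key="$1"
  echo "-fstype=nfs,rw,hard fileserver.example.org:/export/${key}"

Make it executable, point auto.master at it, and autofs treats whatever it
prints on STDOUT as the map entry for that key.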

I don't know that this is available for amd, but it is for autofs.

Skylar

On 12/28/2017 12:47 PM, John Hearns via Beowulf wrote:
> Skylar, I admit my ignorance. What is a program map?
> Where I work now extensively uses automounter maps for bind mounts.
> I may well learn something useful here.
> 
> On 28 December 2017 at 15:28, Skylar Thompson <skylar.thomp...@gmail.com
> <mailto:skylar.thomp...@gmail.com>> wrote:
> 
> We are an AD shop, with users, groups, and automounter maps (for a short
> while longer at least[1]) in the directory. I think once you get to
> around schema level 2003R2 you'll be using RFC2307bis (biggest
> difference from RFC2307 is that it supports nested groups) which is
> basically what modern Linux distributions will be expecting. I can't
> think of any serious problems we've had it with it, though I work on the
> UNIX side so for me it really does just look like a LDAP/Krb5 server.
> 
> I'm not a fan of Microsoft in general, but AD is one of the few products
> that they've actually gotten right. In particular, the replication just
> works --- in the 11 years we've been running AD, I can't think of a
> single time our domain servers got out of sync.
> 
> [1] For automounter maps, we're in the process of moving from LDAP to
> program maps. Due to some internal complexities, we need to support
> multiple definitions for a single mount point, which is easiest to
> accomplish with a client-side program map.
> 
> Skylar
> 
> On 12/27/2017 08:41 PM, Robert Taylor wrote:
> > Hi cluster gurus. I want to pick the your collective brains.
> > Right now, where I work, we have and isilon, and netapp, which we use
> > for our small 250core compute cluster.
> >
> > We have NIS for authentication and automount maps on the cluster side,
> > and AD for authentication on the windows side, and LDAP for yet for
> > other things to authenticate against.  
> > The storage is connected to both nis and AD, and does it's best to
> match
> > the two sides up. 
> > We have had some odd issues with authentication as of late with
> sources
> > getting out of sync, which has brought up the discussion for
> > consolidating down to a single source of truth, which would be AD.
> > RFC2307 talks about stuffing NIS data into LDAP/AD, and there are
> > commercial products such as centrify that can do it. 
> >
> > Does anyone run an entirely AD authentication environment with their
> > compute cluster
> > authenticating against it and using it for automount maps and such?
> > Can you tell me what were your reasons for going that way, and any
> snags
> > that you hit on the way?
> >
> > We've just started looking at it, so I'm on the beginning of this
> road. 
> >
> > Any responses is appreciated. 
> >
> > Thanks.
> >
> > rgt
> >
> >
> > ___
> > Beowulf mailing list, Beowulf@beowulf.org
> <mailto:Beowulf@beowulf.org> sponsored by Penguin Computing
> > To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
> <http://www.beowulf.org/mailman/listinfo/beowulf>
> >
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org
> <mailto:Beowulf@beowulf.org> sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
> <http://www.beowulf.org/mailman/listinfo/beowulf>
> 
> 
> 
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Cluster Authentication (LDAP,NIS,AD)

2017-12-28 Thread Skylar Thompson
We are an AD shop, with users, groups, and automounter maps (for a short
while longer at least[1]) in the directory. I think once you get to
around schema level 2003R2 you'll be using RFC2307bis (biggest
difference from RFC2307 is that it supports nested groups) which is
basically what modern Linux distributions will be expecting. I can't
think of any serious problems we've had with it, though I work on the
UNIX side so for me it really does just look like a LDAP/Krb5 server.

I'm not a fan of Microsoft in general, but AD is one of the few products
that they've actually gotten right. In particular, the replication just
works --- in the 11 years we've been running AD, I can't think of a
single time our domain servers got out of sync.

[1] For automounter maps, we're in the process of moving from LDAP to
program maps. Due to some internal complexities, we need to support
multiple definitions for a single mount point, which is easiest to
accomplish with a client-side program map.

Skylar

On 12/27/2017 08:41 PM, Robert Taylor wrote:
> Hi cluster gurus. I want to pick the your collective brains.
> Right now, where I work, we have and isilon, and netapp, which we use
> for our small 250core compute cluster.
> 
> We have NIS for authentication and automount maps on the cluster side,
> and AD for authentication on the windows side, and LDAP for yet for
> other things to authenticate against.  
> The storage is connected to both nis and AD, and does it's best to match
> the two sides up. 
> We have had some odd issues with authentication as of late with sources
> getting out of sync, which has brought up the discussion for
> consolidating down to a single source of truth, which would be AD.
> RFC2307 talks about stuffing NIS data into LDAP/AD, and there are
> commercial products such as centrify that can do it. 
> 
> Does anyone run an entirely AD authentication environment with their
> compute cluster
> authenticating against it and using it for automount maps and such?
> Can you tell me what were your reasons for going that way, and any snags
> that you hit on the way?
> 
> We've just started looking at it, so I'm on the beginning of this road. 
> 
> Any responses is appreciated. 
> 
> Thanks.
> 
> rgt
> 
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Thoughts on git?

2017-12-19 Thread Skylar Thompson
90% of the battle is using a VCS to begin with. Whether that's SVN, git,
Mercurial, etc. is somewhat irrelevant - just pick something with the
features (and ease of use is a feature!) that you and your team need and
stick with it.

In my professional life, I've found SVN to suit my needs and be easy
enough for other folks I work with (even ones without software
development expertise) to use.

Skylar

On 12/19/2017 08:11 AM, Faraz Hussain wrote:
> I am curious what people think of git. On one hand everyone seems to be
> using it and proclaiming its virtues. On the other hand it seems way
> overkill for how the majority of people code.
> 
> I maintain dozens of scripts to manage various HPC environments . None
> are more than a few hundred lines long. To do backups of scripts, I just
> copy them to some backup folder. Occasionally I might tar them up and
> copy them to a different server. I have never had a need to go back to
> an older version of my script.
> 
> So I tried to learn git but find it very confusing. It seems designed
> for teams of developers working on some million+ line of code project.
> For my rinky-dinky scripts it just adds a lot of confusion. It seems I
> need to "commit" to using git everyday in order for it to be effective.
> Otherwise, use it or lose it.
> 
> Should I force myself to use git everyday? Or maybe find some
> incrementally better way to manage backups of my scripts?
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Varying performance across identical cluster nodes.

2017-09-08 Thread Skylar Thompson
I would also suspect a thermal issue, though it could also be firmware. To
verify a temperature problem, you might try setting up lm_sensors or
scraping "ipmitool sdr" output (whichever is easier) regularly and try to
make a performance-vs-temperature plot for each node. As Andrew mentioned,
it could also be firmware/CPU microcode. We recently tracked down a problem
with some of our nodes that ended up being microcode-related; the CPUs
would start in a high-power state, but end up getting stuck in a low-power
state, regardless of what power management settings we had set in the BIOS.
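
For the temperature side, even something as crude as a cron entry (the log
path is arbitrary) gives you enough data to correlate later:

  */5 * * * * (date; ipmitool sdr type Temperature) >> /var/log/node-temps.log

and you can line that log up against the LINPACK results per node.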

Skylar

On Fri, Sep 8, 2017 at 7:41 PM, Prentice Bisbal  wrote:

> Beowulfers,
>
> I need your assistance debugging a problem:
>
> I have a dozen servers that are all identical hardware: SuperMicro servers
> with AMD Opteron 6320 processors. Every since we upgraded to CentOS 6, the
> users have been complaining of wildly inconsistent performance across these
> 12 nodes. I ran LINPACK on these nodes, and was able to duplicate the
> problem, with performance varying from ~14 GFLOPS to 64 GFLOPS.
>
> I've identified that performance on the slower nodes starts off fine, and
> then slowly degrades throughout the LINPACK run. For example, on a node
> with this problem, during first LINPACK test, I can see the performance
> drop from 115 GFLOPS down to 11.3 GFLOPS. That constant, downward trend
> continues throughout the remaining tests. At the start of subsequent tests,
> performance will jump up to about 9-10 GFLOPS, but then drop to 5-6 GLOPS
> at the end of the test.
>
> Because of the nature of this problem, I suspect this might be a thermal
> issue. My guess is that the processor speed is being throttled to prevent
> overheating on the "bad" nodes.
>
> But here's the thing: this wasn't a problem until we upgraded to CentOS 6.
> Where I work, we use a read-only NFSroot filesystem for our cluster nodes,
> so all nodes are mounting and using the same exact read-only image of the
> operating system. This only happens with these SuperMicro nodes, and only
> with the CentOS 6 on NFSroot. RHEL5 on NFSroot worked fine, and when I
> installed CentOS 6 on a local disk, the nodes worked fine.
>
> Any ideas where to look or what to tweak to fix this? Any idea why this is
> only occuring with RHEL 6 w/ NFS root OS?
>
> --
> Prentice
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] How to debug slow compute node?

2017-08-10 Thread Skylar Thompson
We ran into something similar, though it turned out being a microcode bug
in the CPU that caused it to remain stuck in its lowest power state.
Fortunately it was easily testable with "perf stat" so it was pretty clear
which nodes were impacted, which also happened to be bought as a batch with
a unique CPU version. By the time we did our legwork, the vendor had
independently announced a fix for the problem, so I guess we could have
just saved ourselves some work and waited...
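
Something along these lines is enough to spot it (the event and duration
are arbitrary) -- run it while a CPU-bound job is on the node:

  perf stat -a -e cycles sleep 10

and compare the GHz figure perf prints next to the cycle count against the
CPU's rated clock; the affected nodes sat well below nominal.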

Skylar

On Thu, Aug 10, 2017 at 3:59 PM, Lance Wilson 
wrote:

> Hi Faraz,
> Another one that we have seen was a difference in power profile of the
> node. It caused the node in certain situations to keep the cpu speed low,
> so top looked fine and everything looked fine, just slow. It was a Dell box
> as well. It was interesting that there were so many power settings that
> caused slow downs with Centos 7.
>
> Cheers,
>
> Lance
> --
> Dr Lance Wilson
> Senior HPC Consultant
> Ph: 03 99055942 (+61 3 9905 5942)
> Mobile: 0437414123 (+61 4 3741 4123)
> Multi-modal Australian ScienceS Imaging and Visualisation Environment
> (www.massive.org.au)
> Monash University
>
> On 11 August 2017 at 04:33, John Hearns via Beowulf 
> wrote:
>
>> Ten euros for me on a faulty DIMM
>>
>>
>>
>>
>>
>> Sent from Mail for Windows 10
>>
>>
>>
>> *From: *Andrew Holway 
>> *Sent: *Thursday, 10 August 2017 20:05
>> *To: *Gus Correa 
>> *Cc: *Beowulf Mailing List 
>> *Subject: *Re: [Beowulf] How to debug slow compute node?
>>
>>
>>
>> I put €10 on the nose for a faulty power supply.
>>
>>
>>
>> On 10 August 2017 at 19:45, Gus Correa  wrote:
>>
>> + Leftover processes from previous jobs hogging resources.
>> That's relatively common.
>> That can trigger swapping, the ultimate performance killer.
>> "top" or "htop" on the node should show something.
>> (Will go away with a reboot, of course.)
>>
>> Less likely, but possible:
>>
>> + Different BIOS configuration w.r.t. the other nodes.
>>
>> + Poorly sat memory, IB card, etc, or cable connections.
>>
>> + IPMI may need a hard reset.
>> Power down, remove the power cable, wait several minutes,
>> put the cable back, power on.
>>
>> Gus Correa
>>
>> On 08/10/2017 11:17 AM, John Hearns via Beowulf wrote:
>>
>> Another thing to perhaps look at. Are you seeing messages abotu thermal
>> throttling events in the system logs?
>> Could that node have a piece of debris caught in its air intake?
>>
>> I dont think that will produce a 30% drop in perfoemance. But I have
>> caught compute nodes with pieces of packaking sucked onto the front,
>> following careless peeople unpacking kit in machine rooms.
>> (Firm rule - no packaging in the machine room. This means you)
>>
>>
>>
>>
>> On 10 August 2017 at 17:00, John Hearns <hear...@googlemail.com> wrote:
>>
>> ps.   Look at   watch  cat /proc/interrupts   also
>> You might get a qualitative idea of a huge rate of interrupts.
>>
>>
>> On 10 August 2017 at 16:59, John Hearns wrote:
>>
>> Faraz,
>> I think you might have to buy me a virtual coffee. Or a beer!
>> Please look at the hardware health of that machine. Specifically
>> the DIMMS.  I have seen this before!
>> If you have some DIMMS which are faulty and are generating ECC
>> errors, then if the mcelog service is enabled
>> an interrupt is generated for every ECC event. SO the system is
>> spending time servicing these interrupts.
>>
>> So:   look in your /var/log/mcelog for hardware errors
>> Look in your /var/log/messages for hardware errors also
>> Look in the IPMI event logs for ECC errors:ipmitool sel elist
>>
>> I would also bring that node down and boot it with memtester.
>> If there is a DIMM which is that badly faulty then memtester
>> will discover it within minutes.
>>
>> Or it could be something else - in which case I get no coffee.
>>
>> Also Intel cluster checker is intended to exacly deal with these
>> situations.
>> What is your cluster manager, and is Intel CLuster Checker
>> available to you?
>> I would seriously look at getting this installed.
>>
>>
>>
>>
>>
>>
>>
>> On 10 August 2017 at 16:39, Faraz Hussain wrote:
>>
>> One of our compute nodes runs ~30% slower than others. It
>> has the exact same image so I am baffled why it is running
>> slow . I have tested OMP and MPI benchmarks. Everything runs
>> slower. The cpu usage goes to 2000%, so all looks normal
>> there.
>>
>> I thought it may have to 

Re: [Beowulf] HPC and Licensed Software

2017-04-14 Thread Skylar Thompson
Back when we had software requiring MathLM/FlexLM, we just used NAT to get
the cluster nodes talking to the licensing server. We also had a consumable
in Grid Engine so that people could keep their jobs queued if there were no
licenses available.
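
For the record, the consumable is just a global complex -- something like
this, where the complex name and license count are examples:

  # qconf -mc   (add a line to the complex list)
  starccm_lic    scl    INT    <=    YES    YES    0    0

  # qconf -me global   (then size the pool)
  complex_values    starccm_lic=20

Jobs request -l starccm_lic=1 and sit in qw until a token frees up. GE
never talks to FlexLM itself, so the pool size has to be kept in sync with
what the license server actually has.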

Skylar

On Fri, Apr 14, 2017 at 12:23 PM, Mahmood Sayed 
wrote:

> We've used both NAT and fully routable private networks up to 1000s of
> nodes. NAT was a little more secure fire or needs.
>
> On Apr 14, 2017, at 2:41 PM, Richter, Brian J {BIS} <
> brian.j.rich...@pepsico.com> wrote:
>
> Thanks a lot, Ed. I will be going the NAT route!
>
>
>
> *Brian J. Richter*
>
> Global R&D Senior Analyst • Information Technology
>
> 617 W Main St, Barrington, IL 60010
> Office: 847-304-2356 • Mobile: 847-305-6306
>
> brian.j.rich...@pepsico.com
>
>
>
>
>
> *From:* Swindelles, Ed [mailto:ed.swindel...@uconn.edu
> ]
> *Sent:* Friday, April 14, 2017 1:40 PM
> *To:* Richter, Brian J {BIS} ;
> beowulf@beowulf.org
> *Subject:* Re: [Beowulf] HPC and Licensed Software
>
>
>
> Hi Brian -
>
>
>
> For a couple years we did NAT through our head node for checking out
> licenses (including StarCCM) and talking to Red Hat Satellite. We’re now
> transitioning to fully routed networks for our compute nodes, but NAT did
> the job well.
>
>
>
> --
>
> Ed Swindelles
>
> Manager of Advanced Computing
>
> University of Connecticut
>
>
>
> On Apr 14, 2017, at 2:34 PM, Richter, Brian J {BIS} <
> brian.j.rich...@pepsico.com> wrote:
>
>
>
> Hi All,
>
>
>
> We just built our first HPC and I have what seems like a rather dumb
> question regarding best practices and compute nodes. Our HPC setup is
> currently small, we have a HeadNode which runs MOAB/Torque and we have 4
> compute nodes connected with IB. The compute nodes are on their own private
> network. My question is, what is the best way to handle software that
> requires licenses to run on the compute nodes? For instance we are trying
> to get STARCCM+ up and running on the cluster, the license server is on our
> general user network, the headnode is also on the user network, but when we
> submit a job to torque for STARCCM it keeps throwing errors that the
> software cannot connect to the license server. Is it best practice to
> create a network path from the compute nodes out to the license server
> through the HeadNode? I feel like I’m missing something really simple here.
>
>
>
> Thanks!
>
>
>
> *Brian J. Richter*
>
> Global R&D Senior Analyst • Information Technology
>
> 617 W Main St, Barrington, IL 60010
> Office: 847-304-2356 • Mobile: 847-305-6306
>
> brian.j.rich...@pepsico.com
>
>
>
>
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
>
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] solaris?

2017-02-14 Thread Skylar Thompson
It has a minor role for us for storage (ZFS), but we're retiring our
Solaris boxes as quickly as we can in favor of more GPFS.

Skylar

On 02/14/2017 01:28 PM, Michael Di Domenico wrote:
> just out of morbid curiosity, does Solaris even have a stake in HPC
> anymore?  I've not heard boo about it in quite awhile and there
> doesn't appear to be even one system on the top500 running it.
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Suggestions to what DFS to use

2017-02-13 Thread Skylar Thompson
Is there anything in particular that is causing you to move away from GPFS?

Skylar

On 02/12/2017 11:55 PM, Tony Brian Albers wrote:
> Hi guys,
> 
> So, we're running a small(as in a small number of nodes(10), not 
> storage(170TB)) hadoop cluster here. Right now we're on IBM Spectrum 
> Scale(GPFS) which works fine and has POSIX support. On top of GPFS we 
> have a GPFS transparency connector so that HDFS uses GPFS.
> 
> Now, if I'd like to replace GPFS with something else, what should I use?
> It needs to be a fault-tolerant DFS, with POSIX support(so that users 
> can move data to and from it with standard tools).
> 
> I've looked at MooseFS which seems to be able to do the trick, but are 
> there any others that might do?
> 
> TIA
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] clusters of beagles

2017-01-28 Thread Skylar Thompson
On 01/27/2017 12:14 PM, Lux, Jim (337C) wrote:
> The pack of Beagles do have local disk storage (there's a 2GB flash on
> board with a Debian image that it boots from).
> 
> The LittleFe depends on the BCDD (i.e. "CD rom with cluster image",
> actually a USB stick) which is the sort of thing I was hoping for, but it
> is x86.
> OTOH, maybe that's a pattern to start with.  the BCDD also runs out of
> RAM, which may or may not be a good model.
> 
> An interesting challenge

BCCD actually does support a "liberated" mode (RAM disk copied to
persistent storage). We're also not tied to x86 - we actually used to
have a PPC port, and are considering supporting ARM now that there's
some educational-scale HPC platforms (small multi-core boards w/ 2+GB
RAM, GPGPU, wifi, on-board wired Ethernet) available.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] non-stop computing

2016-10-25 Thread Skylar Thompson
Assuming you can contain a run on a single node, you could use
containers and the freezer controller (plus maybe LVM snapshots) to do
checkpoint/restart.
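
A very rough sketch of the moving parts (the cgroup, volume group, and
snapshot size are made up, and capturing process memory itself would need
something like CRIU layered on top):

# stop the job's tasks so their state quits changing
echo FROZEN > /sys/fs/cgroup/freezer/magma_job/freezer.state
# capture the on-disk working state
lvcreate --snapshot --name scratch_snap --size 20G /dev/vg0/scratch
# ...copy the snapshot (plus any memory dump) somewhere safe, then resume...
echo THAWED > /sys/fs/cgroup/freezer/magma_job/freezer.state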

Skylar

On 10/25/2016 11:24 AM, Michael Di Domenico wrote:
> here's an interesting thought exercise and a real problem i have to tackle.
> 
> i have a researchers that want to run magma codes for three weeks or
> so at a time.  the process is unfortunately sequential in nature and
> magma doesn't support check pointing (as far as i know) and (I don't
> know much about magma)
> 
> So the question is;
> 
> what kind of a system could one design/buy using any combination of
> hardware/software that would guarantee that this program would run for
> 3 wks or so and not fail
> 
> and by "fail" i mean from some system type error, ie memory faulted,
> cpu faulted, network io slipped (nfs timeout) as opposed to "there's a
> bug in magma" which already bit us a few times
> 
> there's probably some commercial or "unreleased" commercial product on
> the market that might fill this need, but i'm also looking for
> something "creative" as well
> 
> three weeks isn't a big stretch compared to some of the others codes
> i've heard around the DOE that run for months, but it's still pretty
> painful to have a run go for three weeks and then fail 2.5 weeks in
> and have to restart.  most modern day hardware would probably support
> this without issue, but i'm looking for more of a guarantee then a
> prayer
> 
> double bonus points for anything that runs at high clock speeds >3Ghz
> 
> any thoughts?
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Generation of strings MPI fashion..

2016-10-08 Thread Skylar Thompson
If you haven't used MPI before, I would break this up into chunks:

1. Write a MPI program that gathers a fixed amount of /dev/urandom
(Brian's suggestion is wise) data and sends it back to the master. This
will get you used to the basic MPI_Send/MPI_Recv commands.

2. Use the same program, but use MPI_Gather rather than MPI_Recv to
assemble the final array. That will get you used to collective
communication.

3. Find a parallel sorting algorithm and implement it. You'll still need
to have rank 0 do some work, but the point of the algorithm would be to
reduce the amount of work it has to do. I found a good description of
parallel merge sort here:

http://penguin.ewu.edu/~trolfe/ParallelMerge/ParallelMerge.html

Good luck!
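
P.S. If you want to see the pieces working before writing any C, a
quick-and-dirty prototype with Open MPI might look like the following
(OMPI_COMM_WORLD_RANK is Open MPI-specific, and the file names are just
examples):

mpirun -np 8 bash -c \
  'tr -cd "[:upper:]" < /dev/urandom | fold -w9 | head -n 1000 > chunk.$OMPI_COMM_WORLD_RANK'
# then merge and de-duplicate the per-rank output on one node:
sort -u chunk.* > strings.txt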

Skylar

On 10/07/2016 10:19 AM, Darren Wise wrote:
> Heya folks, 
> 
> This may seem really simple to some and it is fairly simple from the
> terminal using bash of generation and sorting (examples below)
> 
> What I would like to do, to get started with making a program to run in
> MPI on my cluster.. I thought of a fairly simple bash script and this
> works fine for a single machine but what is the best way to go around it
> or converting this simple notion in to an MPI runable command.
> 
> Generates random strings of chars:
> $ tr -cd '[:upper:]' < /dev/random | fold -w9 | head -c${1:-1000} | tee
> somefile.txt
> 
> Removes duplicate lines:
> $ sort filename.txt | uniq
> 
> This is fine for generation as I mention for a single machine, but
> what's the best way to turn this around and use MPI with a shared NFS
> mounted filesystem..
> 
> While this is an example im going to generate a bash script to allow me
> to generate every possible permutation of a desired string length along
> with types of chars a-z, A-Z, 0-9 along with symbols..
> 
> So any advice is welcome really and it's just an educational way for me
> to transpire into generating code that can be used within my small beowulf.
> 
>> Kind regards,
>> Darren Wise Esq.
>>
>> www.wisecorp.co.uk 
>> www.wisecorp.co.uk/babywise 
>> www.darrenwise.co.uk 
> 
> 
> ___
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] User notification of new software on the cluster

2016-09-27 Thread Skylar Thompson
On 09/27/2016 04:42 PM, Christopher Samuel wrote:
> On 28/09/16 06:02, Rob Taylor wrote:
> 
>> I wanted to ask how people announce new versions of software that has
>> been installed on their cluster to their user base.
> 
> We're pretty lucky, most users know to use "module avail" to see what's
> there and given we tend to install a couple of new packages a day (yay
> bioinformatics) any active update mechanism would probably be overwhelming.
> 
> We did used to have a list of software on our website but it was
> impossible to keep up to date.

We also point folks to "module avail", though now that we're getting
close to 1000 distinct software installs in Modules, maybe we'll have to
do something more advanced sometime soon.
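
In the meantime, one trick that helps users dig through a list that long:
"module avail" writes to stderr, so redirect it before filtering (the
package name here is just an example):

module avail -t 2>&1 | grep -i bowtie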

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Netapp EF540 password reset

2016-08-26 Thread Skylar Thompson
I don't have any info to help, but you might try asking on the lopsa-tech
list as well. I know there's Netapp admins over there who might be able to
help.

Skylar

On Friday, August 26, 2016, dave.b...@diamond.ac.uk 
wrote:

> Hi All,
>
> We have a Netapp EF540 that has a management password set.  Long story
> short I would like to reset the management password where the original is
> not known.  Currently connecting over serial I have the ability to enter
> the Service interface.  But I do not have any details or credentials for
> this.  I believe the process is as follows:
>
>
> For 6.xx Controller Firmware releases:
>
>   1.  Exit SANtricity and/or any active SMCli session.
>   2.  Log in to the controller shell using rlogin or a serial connection
> (either controller works).
> -> loadDebug
> -> clearSYMbolPassword
> -> unld “ffs:Debug”
>   3.  When the GUI is re-launched, it will be accessible and no longer
> password-protected.
>
>  For 7.xx and 8.xx Controller Firmware releases:
>
>   1.  Exit the GUI and run the following shell commands on the Controller:
> -> loadDebug
> -> setSAPassword_MT ""
> -> unld "Debug"
> This will set the password to an empty string.
>   2.  Remove the array from the Enterprise Management Window and re-add it.
> When the GUI is re-launched, it will be accessible and no longer
> password-protected.
>   3.  Re-access the array.
>
> Would anyone be able to confirm this and / or let me know the service
> password?
>
>
> Regards
>
> Dave
>
> --
> This e-mail and any attachments may contain confidential, copyright and or
> privileged material, and are for the use of the intended addressee only. If
> you are not the intended addressee or an authorised recipient of the
> addressee please notify us of receipt by returning the e-mail and do not
> use, copy, retain, distribute or disclose the information in or attached to
> the e-mail.
> Any opinions expressed within this e-mail are those of the individual and
> not necessarily of Diamond Light Source Ltd.
> Diamond Light Source Ltd. cannot guarantee that this e-mail or any
> attachments are free from viruses and we cannot accept liability for any
> damage which you may sustain as a result of software viruses which may be
> transmitted in or with the message.
> Diamond Light Source Limited (company no. 4375679). Registered in England
> and Wales with its registered office at Diamond House, Harwell Science and
> Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
>
> ___
> Beowulf mailing list, Beowulf@beowulf.org  sponsored by
> Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Demo-cluster ideas

2016-03-08 Thread Skylar Thompson
Thanks for the report back! The color-coding is a nifty idea.
Skylar

On 03/07/2016 08:44 AM, Olli-Pekka Lehto wrote:
> First iteration of the mini-cluster is now in production. Some takeaways and
> observations:
> 
> We did the first deployment of the mini-cluster, called "Sisunen" last weekend
> at the local science centre. We ended up developing the concept so that kids
> could actually construct the cluster. We held 7 build sessions, each with 10
> participants, over 2 days. Each session took about 30min. This really made it 
> a
> really tangible experience. Things went very smoothly, largely thanks to the
> really good team of HPC specialists that ran the workshops.
> 
> Here’s a video of the end product: 
> https://www.instagram.com/p/BCnGg0VI6q-A3ft1NjXe13uPhBxoS99WjcX3qM0/
> 
> 
> We ended up with following demos:
> 
> SPH from Tiny Titan - This was probably the most popular demo. We amped up the
> particle count by 10x compared to the Raspberry Pi version and got it working
> nicely.
> 
> PiBrot from Tiny Titan - Helped show the difference in parallel performance
> between 1 and 10 nodes. Explaining fractals in simple enough terms was a bit
> challenging though.
> 
> Game of Life - One of our standard MPI training projects, spruced up for the
> demos a bit.
> 
> NAMD+VMD - It was nice to show a real world code (molecular dynamics). 
> Finding a
> better input set might be useful.
> 
> Blender 3D - We ran out of time with this a bit. Works but Still need to set 
> up 
> a nice scene to render with a good balance of amount of frames, wow-factor 
> and 
> render time.
> 
> The codes will be made available here and hopefully we’ll release more in the
> future: https://github.com/sisunen We also welcome all contributions, of 
> course
> :)
> 
> In the future it would be nice to get an interactive deep learning demo set 
> up.
> Possibly neural style (https://github.com/jcjohnson/neural-style) and/or Deep
> Dream (https://github.com/google/deepdream). If anyone is up for a nice ML
> project for themselves or possibly some grad student then productising these 
> for
> demo cluster use might not be a bad idea. :)
> 
> Also the blink(1) USB LED indicators really helped illustrate how things were
> parallelized by color-coding the different tasks and setting the LEDs to
> correspond. Any demo code should ideally have support for these.
> 
> O-P

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Demo-cluster ideas

2016-02-01 Thread Skylar Thompson
Hi Olli-Pekka,

When we have LittleFe (http://littlefe.net) out in the wild (which
sounds a lot like what you're trying to do!), GalaxSee and Game of Life
are two favorites:

http://shodor.org/petascale/materials/UPModules/NBody/
http://shodor.org/petascale/materials/UPModules/GameOfLife/

They're simple enough to be understandable, are visual so even if you
don't grok the algorithm right away you can still get something out of
it, but still complex enough to be a good HPC show-case.

Skylar

On 02/01/2016 07:57 AM, Olli-Pekka Lehto wrote:
> We're in the process of developing a demo Beowulf cluster in the spirit of 
> Tiny Titan (http://tinytitan.github.io/). However, the one we are working on 
> is based on 10 Intel i7 small-form-factor boards with a fairly good per-core 
> compute capability.
> 
> I'd be interested to hear if you have some ideas (or even pointers to code!) 
> for demonstrating HPC concepts to the general public. Especially having an 
> element of interactivity would be nice. 
> 
> Couple of low-hanging fruits that we are already looking at are 
> Blender/povray-rendering and porting the existing Tiny Titan codes.
> 
> Best regards,
> Olli-Pekka
> 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] job scheduler and accounting question

2015-07-14 Thread Skylar Thompson
On 07/14/2015 04:17 PM, Stuart Barkley wrote:
 On Tue, 14 Jul 2015 at 15:06 -, Joe Landman wrote:
 
 Has the gridengine mess ever been sorted out?
 
 We are using Son of Grid Engine http://arc.liv.ac.uk/SGE/ which
 seems to be not-dead-yet.  It does seem to rely heavily on just one or
 two developers at this point.
 
 Stuart
 

We're using Univa ourselves. The support has been quite good, and the
developers are responsive to feature requests. The accounting gets
written to a colon-delimited file which is easy to parse w/
awk/Perl/Python/etc. They also have an accounting product we haven't
setup yet.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Software installation across cluster

2015-05-26 Thread Skylar Thompson
We also use Environment Modules. One of the great things about Modules
is that the modulefiles are written in Tcl, which means you can make
them arbitrarily complex. We have a common header that's sourced for all
of our module files that automatically sets common environment variables
based on the install path of a piece of software. By keeping consistent
install paths, this lets us keep our module files quite short (sometimes
only a couple lines):

software-name/software-version/OS/distro/ISA

For instance:

VCFtools/0.1.11/Linux/RHEL6/x86_64

This lets us support any number of platforms, and having each component
of a platform at a common location makes it easy to parse out in the
modules header. All of our module-installed software is accessible over
NFS from any cluster node. NFS works fine for almost any piece of
software; the exception being R due to the huge number of files it has.
Fortunately no one expects R to perform well...
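
The header itself is Tcl, but as a rough bash sketch of how the install
prefix gets composed from that layout (the /net/apps root and the distro
detection below are placeholders, not our real paths):

OS=$(uname -s)        # e.g. Linux
DISTRO=RHEL6          # detected from /etc/*-release in practice
ISA=$(uname -m)       # e.g. x86_64
PREFIX=/net/apps/VCFtools/0.1.11/$OS/$DISTRO/$ISA
export PATH=$PREFIX/bin:$PATH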

Regardless of how you manage software, I strongly recommend you not let
it provide users with a default software version. That necessarily
changes over time, and will surprise users who decide to use it. It's
far better to force users to choose versions explicitly than to let them
shoot themselves in the foot.

Skylar

On 05/26/2015 02:45 PM, Trevor Gale wrote:
 Hello all,
 
 I was wondering what solutions other use for easy software installation 
 across clusters. Is there any method that is generally accepted to be the 
 most effective for making sure that all the software is consistent across 
 each node?
 
 Thanks,
 Trevor
 ___
 Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
 To change your subscription (digest mode or unsubscribe) visit 
 http://www.beowulf.org/mailman/listinfo/beowulf
 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Cluster networking/OSs

2015-05-08 Thread Skylar Thompson
Hi Trevor,

I'm another BCCD developer. Since we target lower-end clusters, we don't
support Infiniband, although as a Debian-based distribution it wouldn't be
hard to install. Most of the software we support is pedagogical in nature -
N-body simulations, numerical methods, etc. Our emphasis is on making BCCD
easy to use for non-computer people, so we're not targeting the high-end at
all.

Skylar

On Fri, May 8, 2015 at 8:35 PM, Trevor Gale tre...@snowhaven.com wrote:

 I'll definitely check out BCCD.

 Thanks for the detailed response Jorg! Most of the clusters that I've
 worked with are generally used for large scale parallel jobs, I would be
 very interested to learn more about your large cluster for running jobs
 like this. Is there any software you're running across the nodes other than
 the InfiniBand, DHCP, PXE and local DNS?

 Thanks,
 Trevor

 On May 8, 2015, at 4:55 PM, Aaron Weeden amweeden.earl...@gmail.com
 wrote:

 Hi Trevor,

 Not to toot my own horn here, but BCCD is designed with education in mind:
 http://bccd.net

 Aaron

 On Fri, May 8, 2015 at 3:30 PM, Trevor Gale tre...@snowhaven.com wrote:

 Hey Everyone,

 I'm fairly new to linux clusters and am trying to learn as much as I can
 about specifically networking on clusters and different operating systems
 run across clusters. Does anyone know any good resources to learn from?

 Thanks,
 Trevor

 ___
 Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
 To change your subscription (digest mode or unsubscribe) visit
 http://www.beowulf.org/mailman/listinfo/beowulf




 ___
 Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
 To change your subscription (digest mode or unsubscribe) visit
 http://www.beowulf.org/mailman/listinfo/beowulf


___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] RAID question

2015-03-14 Thread Skylar Thompson

On 3/13/2015 5:52 PM, mathog wrote:

A bit off topic, but some of you may have run into something similar.

Today I was called in to try and fix a server which had stopped 
working.  Not my machine, the usual sysop is out sick.  The

model is a Dell PowerEdge T320 with a Raid PERC H710P controller.

The symptoms reported were it stopped working, could not find 'ls', 
and wouldn't reboot past grub.  (Evidently it could find 'reboot'.)


Got into the BIOS and ran RAID consistency check, which took 3 hours.  
It didn't say if it had passed or failed, or put up any sort of status 
message whatsoever, but there were no failure lights lit on the disks.


On a reboot it gives:

  grub error 8: kernel must be loaded before booting.

It is a Centos 6.5 system, so booted it with an installation disk of 
that flavor, and dropped down into a shell.


This is where it gets strange.

/boot is in /dev/sdb1.  When mounted that directory is empty but
when unmounted fsck shows 10 files in it taking up about 12Mb. Pretty 
clear why it wouldn't boot with nothing in /boot.  Not sure
what the 10 files fsck sees are, perhaps part of the filesystem. (ext2 
I think).  I had never tried running fsck on an empty file system in a 
partition before.


/bin is missing entirely, so that's why ls stopped working. /usr/bin 
is still there, which is why reboot was OK.


/var/log/messages shows that the machine was logging what look like 
corrected disk errors (sense errors) for /dev/sdb1 for days before it 
failed.


Tried copying the contents of another machine's /boot (which is 
supposed to be an exact copy of this one) into /boot, and rebooting,
but grub didn't get any farther than it had before.  Probably grub 
needs to be reinstalled, but with /bin missing, and who knows what 
else gone besides, it seems like a full OS reinstall would be in order.


Off the top of my head, if it weren't for the sense errors on 
/dev/sdb1, I would think that this might have been the result of an 
accidental (or hacker's)


  rm -rf /

Anybody run into a hardware/software glitch with symptoms like this on 
a similar system???


Is there some way on these sorts of Dell's to run per disk diagnostics 
from BIOS or UEFI even if they are already grouped into a virtual disk 
by the controller?  I suspect that the disk which is /dev/sdb may 
really be on its way out, but I couldn't get smartctl to work off the 
DVD or from the copy on disk.   (The smartctl commands used were 
tested on the twin machine, and they worked there.)  The BIOS showed 
that SMART was disabled on all of the disks.  Web searches for 
diagnostics for this controller all referenced software that requires 
a running OS, nothing built into the BIOS/UEFI.  (It is set to use BIOS.)


I might start looking at non-RAID problems first. Maybe you have some 
bad memory or CPU? Errant rm could do it too, as you mentioned.


Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Replacement for C3 suite from ORNL

2015-02-28 Thread Skylar Thompson
On 02/27/2015 02:34 PM, Fabricio Cannini wrote:
 Hello all
 
 Does anybody know of a replacement for the C3 suite of commands ( cpower
 ...) that is very common in SGI machines and used to be available here:
 
 http://www.csm.ornl.gov/torc/C3/

We've been using pdsh:

https://code.google.com/p/pdsh/
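
Typical usage, with dshbak collapsing identical output (the node range is
just an example):

pdsh -w n[001-128] uptime | dshbak -c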

Skylar

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] HPC demonstrations

2015-02-11 Thread Skylar Thompson
On 02/10/2015 02:52 AM, John Hearns wrote:
 I am giving an internal talk tomorrow, a lightweight introduction to HPC.
 
  
 
 Can anyone suggest any demonstrations of HPC which I could give –
 something visual?
 
 Or are there any videos I could pick up from somewhere?
 
  
 
 I had thought of showing an H-bomb test video from Youtube, and saying
 “Well, you aren’t allowed to do that any more”

Hi John,

It might be too late, but I like to use GalaxSee[1] and Life[2] as HPC
demos. They're now maintained by the Shodor Foundation, although
Galaxsee's heritage predates it. I've used them in the context of the
Bootable Cluster CD although you can run anywhere with X11, MPI and a C
compiler.

What's nice about them:

1. They're visual in a way that emphasizes the parallelism of the algorithms

2. They're simple enough for even beginning computational scientists to
understand

3. They're easily tunable for different problem set sizes

Skylar

[1] http://bccd.net/wiki/index.php/GalaxSee

[2] http://bccd.net/wiki/index.php/Game_of_Life
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] IPoIB failure

2015-01-27 Thread Skylar Thompson
On 01/27/2015 02:24 PM, Christopher Samuel wrote:
 On 24/01/15 01:29, Lennart Karlsson wrote:
 
 This reminds me of when we upgraded to SL-6.6 (approximately the same as
 CentOS-6.6 and RHEL-6.6).

 The new kernel we got, could not handle our IPoIB for storage traffic,
 which broke down within a few hours.
 
 Interesting, we use GPFS over IPoIB and upgraded to RHEL 6.6 in early
 November and haven't seen any issues at all (and with a lot of
 bioinfomatics users we'd notice problems pretty quickly).
 
 Is your IB running in connected mode or datagram mode?
 
 We're in connected mode everywhere because of our BG/Q.

We've had some problems with the RHEL-provided OFED stack interfering
with the Mellanox one. One of the symptoms we've seen in the past
is that some IB services (like RDMA) work while others (like IPoIB) don't.
Using the Mellanox install script in the MLNX OFED package clears this
up. I wonder if this is what's going on?
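
(For reference, the script is the mlnxofedinstall wrapper at the top of the
unpacked bundle; the directory name below is just the usual naming pattern:)

cd MLNX_OFED_LINUX-*/ && ./mlnxofedinstall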

Skylar

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Python libraries slow to load across Scyld cluster

2015-01-17 Thread Skylar Thompson
On 01/16/2015 04:38 PM, Don Kirkby wrote:
 Thanks for the suggestions, everyone. I've used them to find more 
 information, but I haven't found a solution yet. 
 
 It looks like the time is spent opening the Python libraries, but my attempts 
 to change the Beowulf configuration files have not made it run any faster. 
 
 Skylar asked: 
 
 Do any of your search paths (PATH, PYTHONPATH, LD_LIBRARY_PATH, etc.) 
 include a remote filesystem (i.e. NFS)? This sounds a lot like you're 
 blocked on metadata lookups on NFS. Using strace -c will give you a 
 histogram of system calls by count and latency, which can be helpful in 
 tracking down the problem. 
 
 Yes, the compute nodes mount from a network file system to a local RAM disk. 
 When I look at mounted file systems, I can see that the Python libraries are 
 on a network mount. The Python libraries are at /usr/local/lib/python2.7. 
 
 $ bpsh 5 df 
 Filesystem 1K-blocks Used Available Use% Mounted on 
 [...others deleted...] 
 192.168.1.1:/usr/local/lib
  926067424 797367296  80899808  91% /usr/local/lib
 
 
 I used strace as suggested and found that most of the time is spent in 
 open(). 
 
 $ bpsh 5 strace -c python2.7 cached_imports_decimal.py 
 started at 2015-01-16 14:29:45.543066 
 imported decimal at 0:00:21.719083 
 % time seconds usecs/call calls errors syscall 
 -- --- --- - -  
 97.95 0.040600 44 932 822 open 
 [...others deleted...] 
 
 
 I also looked at the timing of the individual system calls to see which files 
 were slow to open: 
 
 bpsh 5 strace -r -o strace.txt python2.7 cached_imports_decimal.py 
 more strace.txt 
 [...] 
 0.63 open("/usr/local/lib/python2.7/lib-dynload/usercustomize.so", 
 O_RDONLY) = -1 ENOENT (No such file or directory) 
 0.000701 open("/usr/local/lib/python2.7/lib-dynload/usercustomizemodule.so", 
 O_RDONLY) = -1 ENOENT (No such file or directory) 
 0.127012 open("/usr/local/lib/python2.7/lib-dynload/usercustomize.py", 
 O_RDONLY) = -1 ENOENT (No such file or directory) 
 0.126985 open("/usr/local/lib/python2.7/lib-dynload/usercustomize.pyc", 
 O_RDONLY) = -1 ENOENT (No such file or directory) 
 0.127037 stat("/usr/local/lib/python2.7/site-packages/usercustomize", 
 0x7fff28a973f0) = -1 ENOENT (No such file or directory) 
 0.86 open("/usr/local/lib/python2.7/site-packages/usercustomize.so", 
 O_RDONLY) = -1 ENOENT (No such file or directory) 
 0.126963 
 open("/usr/local/lib/python2.7/site-packages/usercustomizemodule.so", 
 O_RDONLY) = -1 ENOENT (No such file or directory) 
 [...] 

Do you have attribute caching (ac) setup for the NFS mount? Assuming
this is a mostly read-only NFS mount point, you might also consider
disabling close-to-open cache coherence (nocto), which will
significantly increase NFS performance at the expense of breaking POSIX
compliance.

There's a good discussion of the implications in the nfs(5) man page in
the DATA AND METADATA COHERENCE section.
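
Something along these lines would be a reasonable starting point (the
server, paths, and timeout are examples only):

mount -t nfs -o ro,nocto,actimeo=600 192.168.1.1:/usr/local/lib /usr/local/lib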

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Python libraries slow to load across Scyld cluster

2015-01-15 Thread Skylar Thompson
Do any of your search paths (PATH, PYTHONPATH, LD_LIBRARY_PATH, etc.)
include a remote filesystem (i.e. NFS)? This sounds a lot like you're
blocked on metadata lookups on NFS. Using strace -c will give you a
histogram of system calls by count and latency, which can be helpful in
tracking down the problem.

Skylar

On 01/15/2015 03:22 PM, Don Kirkby wrote:
 I'm new to the list, so please let me know if I'm asking in the wrong place.
 
 We're running a Scyld Beowulf cluster on CentOS 5.9, and I'm trying to
 run some Django admin commands on a compute node. The problem is that it
 can take three to five minutes to launch ten MPI processes across the
 four compute nodes and the head node. (We're using OpenMPI.)
 
 I traced the delay to smaller and smaller parts of the code until I
 created two example scripts that just import a Python library and print
 out timing information. Here's the first script that imports the
 collections module:
 
 from datetime import datetime
 t0 = datetime.now()
 print 'started at {}'.format(t0)
 
 import collections
 print 'imported at {}'.format(datetime.now() - t0)
 
 
 
 When I run that with mpirun -host n0 python
 cached_imports_collections.py, it initially takes about 10 seconds to
 import. However, repeated runs take less time, until it takes less than
 0.01 seconds to import.
 
 Running an equivalent script to import the decimal module takes about 30
 seconds, and never speeds up like that. It may be too large to get
 cached completely. I ran beostatus while the script was running, and I
 didn't see the network traffic go over 100 kBps. For comparison, running
 wc on a 100MB file on a compute node causes the network traffic to go
 over 3000 kBps.
 
 I looked in the Scyld admin guide (PDF) and the reference guide (PDF),
 and found the bplib command that manages which libraries are cached and
 not transmitted with the job processes. However, its list of library
 directories already includes the Python installation.
 
 Is there some way to increase the size of the bplib cache, or am I doing
 something inefficient in the way I launch my Python processes?
 
 Thanks,
 Don Kirkby
 British Columbia Centre for Excellence in HIV/AIDS
 cfenet.ubc.ca
 
 
 ___
 Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
 To change your subscription (digest mode or unsubscribe) visit 
 http://www.beowulf.org/mailman/listinfo/beowulf
 

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Putting /home on Lustre or GPFS

2014-12-24 Thread Skylar Thompson
On 12/24/2014 06:44 AM, Michael Di Domenico wrote:
 On Tue, Dec 23, 2014 at 6:43 PM, Christopher Samuel
 sam...@unimelb.edu.au wrote:
 On 24/12/14 05:35, Michael Di Domenico wrote:

 I've always shied away from gpfs/lustre on /home and favoured netapp's
  for one simple reason.  snapshots.  i can't tell you how many times
 people have accidentally deleted a file.

 We use GPFS snapshots for our project areas already, for just that
 reason. :-)
 
 Hmmm, i haven't followed GPFS all that much since probably 2005'ish,
 is snapshots in it fairly new?  I don't recall them being there way
 back then.  Perhaps a re-evaluation is in order...

They've been supported at least since v3, not sure about before though.
One caution is that deleting snapshots is very metadata-intensive, so if
you have lots of files you'll want to consider placing your metadata on
fast storage (although probably you'll want to consider it regardless of
snapshots).
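
For reference, the snapshot lifecycle is just (the filesystem and snapshot
names here are placeholders):

mmcrsnapshot gpfs0 home_$(date +%Y%m%d)   # create
mmlssnapshot gpfs0                        # list what exists
mmdelsnapshot gpfs0 home_20141201         # delete -- the metadata-heavy step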

Skylar

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] SC14 website scheduling issues

2014-11-16 Thread Skylar Thompson
On 11/15/2014 11:12 PM, Novosielski, Ryan wrote:
 On Nov 16, 2014, at 00:48, Stuart Barkley stua...@4gh.net wrote:

 Another note: Just like last time SC was in New Orleans, the shuttle
 bus operator at the airport has never heard of a convention coming
 into town and seems to run the same shuttle schedule all the time.
 It was a 45 minute wait in line for the shuttle to get to the hotel.
 
 You're lucky! I arrived tonight at the same time as a lot of people did for 
 whatever other conference is here, and I was told that there would be a two 
 hour wait for the shuttle. I waited in the taxi line for about a half an hour 
 to get a taxi.

I ended up taking the bus from the airport downtown. Cost was $2, took
~50 minutes, and dropped me off two blocks from the hotel.

Skylar

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Gentoo in the HPC environment

2014-06-30 Thread Skylar Thompson

On 06/30/2014 05:34 PM, Christopher Samuel wrote:
 On 01/07/14 10:27, Christopher Samuel wrote:
 
 then all the applications are in /usr/local
 
 To quickly qualify that, our naming scheme is:
 
 /usr/local/$application/$version-$compiler/
 
 We name modules as:
 
 $application-$compiler/$version
 
 so someone can do:
 
 module load gromacs-intel
 
 and get the latest version of Gromacs built with the Intel
 compilers.
 
 Then we can do these tricks like pull values out in relatively
 generic module files thus:
 
  [...]
  set ver [lrange [split [ module-info name ] / ] 1 1 ]
  set name [lrange [split [ module-info name ] / ] 0 0 ]
  set subname [lrange [split $name - ] 0 0 ]
  set compiler [lrange [split $name - ] 1 1 ]
 
 if { ![ is-loaded $compiler ] } { module load $compiler }
 
 prepend-path PATH /usr/local/$subname/$ver-$compiler/bin [...]
 
 :-)

We do something similar, although split our software up into
OS/distro/ISA (i.e. Linux/RHEL4/i686) just to keep maximum
flexibility. That path suffix is configured in the $ARCHPATH variable
so that it's not really any extra overhead to any of the tools.

One of the really cool things about Modules is the power of Tcl - it's
allowed us to make our basic Module files no more than a couple lines
long, with most of the functionality computed on the fly in a header
file based on the platform and path to the Module file. We can even
throw an error if, say, a particular piece of software isn't available
for that platform.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Small files

2014-06-16 Thread Skylar Thompson
On 06/13/2014 01:37 PM, Lux, Jim (337C) wrote:
 I've always advocated using the file system as a database: in the sense of
 "lots of little files, one for each data blob", where "data blob" is
 bigger than a few bytes, but perhaps in the hundreds/thousands of bytes or
 larger.
 
 1) Rather than spend time implementing some sort of database, the file
 system is already there
 2) The file system is likely optimized better for whatever platform it is
 running. It runs "closer to the metal", and hopefully is tightly
 integrated with things like caching, and operating system tricks.
 3) The file system is optimized for allocation and deallocation of space,
 so I don't have to write that, or hope that my "database engine of choice"
 does it right.
 4) backup and restore of parts of the data is straightforward without
 needing any special utilities (e.g. file timestamp gives mod dates, etc.)

This works well for local access, and even OK over something like NFS
for a single node's access, but for distributed access cache
invalidation (especially for metadata) gets to be a serious problem. I
work with a lot of people who write a workflow for their desktop/laptop,
then try running it successfully with a single process on a cluster.
When they try running it with hundreds of processes distributed across
many nodes, they're confused when it all falls apart.

Backup/restore can be complicated too, depending on the storage
technology. Many storage vendors assume that NDMP is the end-all for
backup technology, and provide nothing but that. My view is that NDMP is
a scam setup by storage vendors to get people to buy more storage, but
that's a discussion for another thread...

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Small files

2014-06-13 Thread Skylar Thompson
We've recently implemented a quota of 1 million files per 1TB of
filesystem space. And yes, we had to clean up a number of groups' and
individuals' spaces before implementing that. There seems to be a trend
in the bioinformatics community for using the filesystem as a database.
I think it's enabled partly by a lack of knowledge of scaling and
speedup in the community, since so much stuff still runs on laptops and
desktops. I'd really like to teach a basic scientific computing class at
work to address those concepts, but that would take more time than I
have right now.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] job scheduler and health monitoring system

2014-01-11 Thread Skylar Thompson
On 01/10/2014 12:36 PM, reza azimi wrote:
 hello guys, 
 
 I'm looking for a state of art job scheduler and health monitoring for
 my beowulf cluster and due to my research I've found many of them which
 made me confused. Can you help or recommend me the ones which are very
 hot and they are using in industry? 
 I have lm-sensors package on my servers and wanna a health monitoring
 program which record the temp as well, all I found are mainly record
 resource utilization. 
 Our workload are mainly MPI based benchmarks and we want to test some
 hadoop benchmarks in future.

Our solution with Grid Engine is to have a cron job monitoring the
contents of the IPMI SEL. If any messages are in the SEL that are not on
a whitelist, a file in /var gets generated (conversely, if no messages
are in the SEL, the file gets removed). We have a GE load sensor that
monitors for the presence of this file and places that node in an alarm
state when it sees this file, preventing new jobs from being scheduled
on the node. We then have Nagios monitoring the output of qstat -xml
on the scheduler nodes so we get notified of when a node goes into an
alarm state.
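
The cron side is nothing fancy -- roughly the following, where the
whitelist and flag file paths are made up for illustration:

#!/bin/bash
# flag the node if the SEL contains anything not on the whitelist
if ipmitool sel list | grep -v -f /etc/sel_whitelist | grep -q . ; then
    touch /var/spool/sel_alarm
else
    rm -f /var/spool/sel_alarm
fi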

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] ZFS for HPC

2013-12-22 Thread Skylar Thompson

On 12/22/2013 4:14 PM, Christopher Samuel wrote:


Hi Andrew,

On 26/11/13 23:10, Andrew Holway wrote:


Does checksumming save our data?

Well that will depend on whether your setup is to just detect
corruption, or be able to correct it too.



The one time we actually had corruption on-disk, ZFS was nice enough to 
tell us exactly which files were corrupted. This made it really easy to 
recover files from backups from before the corruption took place.
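
That is, after a scrub turns something up, listing the damaged files is
just (the pool name is an example):

zpool scrub tank       # full verification pass
zpool status -v tank   # prints the paths of files with permanent errors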


Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] ZFS for HPC

2013-12-22 Thread Skylar Thompson

On 12/22/2013 6:51 PM, Christopher Samuel wrote:


On 23/12/13 13:30, Skylar Thompson wrote:


The one time we actually had corruption on-disk, ZFS was nice
enough to tell us exactly which files were corrupted. This made it
really easy to recover files from backups from before the
corruption took place.

An interesting point, if you are running scrubs at a known interval
you could go back to the backup immediately prior to the last good scrub.

Time for me to run regular scrubs on my btrfs filesystems! :-)



Yup, we've got ours set to run over the weekend. As it happens, most of 
our drive failure reports come in on Monday.
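
e.g. a root crontab entry along these lines (the pool name is an example):

0 2 * * 6  /usr/sbin/zpool scrub tank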


Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] BeoBash/SC13

2013-11-10 Thread Skylar Thompson
I'll be there, along with the rest of the LittleFe crew.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Heterogeneous, intermitent beowulf cluster administration

2013-09-26 Thread Skylar Thompson
On 09/26/2013 06:25 AM, Gavin W. Burris wrote:
 Hi, Ivan.
 
 I'm a nay-sayer in this kind of scenario.  I believe your staff time,
 and the time of your lab users, is too valuable to spend on
 dual-classing desktop lab machines.

I'm with Gavin here - hardware has gotten too cheap for this to be
viable in most cases. Furthermore, so many research/computational jobs
benefit from environments distinct from ideal desktops (whether that's
core count, RAM, interconnect, or operating system) that you'll probably
only be satisfying a small subset of the job requests you receive.

Skylar

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] zfs

2013-09-14 Thread Skylar Thompson
On 9/14/2013 3:52 PM, Andrew Holway wrote:
 Hello,

 Anyone using ZFS in production? Stories? challenges? Caveats?

 I've been spending a lot of time with zfs on freebsd and have found it
 thoroughly awesome.


We have a bunch of ZFS-based storage systems, all running Solaris, and 
falling into two classes:

* Sun/Oracle hardware - We have a dozen or so X4540s. Most of these run 
Solaris 10; one runs Solaris 11, installed before Oracle came out and said Solaris 
11 was /not/ supported on the X4540. Older versions of Solaris 10 had 
some hardware integration problems and nasty ZFS and networking bugs, 
but the latest patch cluster has solved all of those. Solaris 11 does 
have issues, though, with hardware integration. Another issue we've had 
is that Oracle is perpetually out of spare drives - the rumor is that 
the Seagate drives Sun shipped in the X4540s have manufacturing defects 
that shorten their service life considerably, and Oracle has struggled 
to get other drives certified for the systems. We've easily lost 40 
drives in our X4540s this year alone out of 500-600 total, all Seagate. 
We've had to wait six weeks for 1TB SATA replacements, on NBD contracts.

* Dell hardware - before we rolled our current consolidated storage, we 
had a number of labs needing to buy bulk storage urgently. We ended up 
buying Dell servers and drive trays, and running Solaris 11 with ZFS. 
We've had some challenges, but for the price it definitely has worked 
out. Until we updated to the latest Solaris 11 patch cluster, we had 
some difficulty identifying failed drives. We've also had trouble with 
networking drivers, and tracking down other hardware problems like 
failing NICs causing system hangs. There definitely isn't as much 
integration between Solaris and the hardware as with real Oracle 
hardware. The good news is that the moment you say Solaris to Dell 
support they just believe whatever you tell them, without having to run 
additional diagnostics. This makes hardware repair much faster than on 
Linux or Windows systems.

We considered running FreeBSD on some of these systems, but the lack of 
enterprise support made us somewhat leery (not that Oracle support is 
all that great). Definitely if you're going Solaris make sure to get the 
latest patch cluster. In addition to the hardware-specific bugs, we also 
ran into a ZFS bug that caused it to ignore media and transport errors 
for drives even when the hardware and fmadm are reporting faults, and 
another one that would cause scrubs to hang the system.

One thing I wish we had done was buy SSDs for at least some of these 
systems, particularly the ones with lots of tiny files. ZFS metadata 
overhead is pretty high, but separating out L2ARC/ZIL onto SSD would 
have made performance much better. Live and learn, I guess...
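
For anyone in the same spot, retrofitting the devices is at least
straightforward (pool and device names below are examples):

zpool add tank log mirror c1t4d0 c1t5d0   # mirrored ZIL / slog
zpool add tank cache c1t6d0               # L2ARC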

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Strange resume statements generated for GRUB2

2013-06-10 Thread Skylar Thompson
On 06/10/2013 10:35 AM, Hearns, John wrote:
 
 
 
 Taking into account small size of my swap partition (4GB only), less
 than my RAM size,
 (I wrote about this situation in my 1st message) the hibernation image
 may not fit into swap partition. Therefore coding of -part2 (for /) in
 resume statement is preferred (right for general case).
 
 Are you SURE about that?
 I have a system sitting next to me which has a lot more RAM than swap,
 and the resume= is still set to the swap partition.
 Then again I must admit this is a desktop system, and I don't
 hibernate/suspend it, so that parameter is useless anyway.

I imagine you just need enough swap space to handle all your processes'
heap. Process text segment can be discarded outright, and buffer/cache
can be discarded once it's no longer dirty.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Prevention of cpu frequency changes in cluster nodes (Was : cpupower, acpid cpufreq)

2013-06-09 Thread Skylar Thompson
On 06/09/2013 11:35 AM, Mikhail Kuzminsky wrote:
 I installed OpenSuSE 12.3/x86-64 now. I may now say about the reasons
 why I am afraid of loading of cpufreq modules.
 
 1) I found in /var/log/messages pairs of strings about governor like
 
 [kernel] cpuidle: using governor ladder
 [kernel] cpuidle: using governor menu
 
 and strange for me
 [kernel] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
 [kernel] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
 
 2) The presence on installed system of /sys/devices/system/cpu/cpufreq
   /sys/devices/system/cpu/cpu0/cpuidle
  directories. cpuidle directories contains state0, state1 etc
 directories w/non-empty files.
 
 3) But to prevent  cpu frequency changes I suppressed all like
 possibilities in BIOS.
 4) And I don't have (as I wrote in my previous Beowulf message)
 /sys/devices/system/cpu/cpu0/cpufreq files.
 
 Just the presence of this file is used by my /etc/init.d/cpufreq script
 as test of needs to load cpufreq kernel modules.
 
 5) lsmod says that there is no cpufreq modules loaded.
 Any comments ? Am I everywhere here right and should I ignore my afraids
 about kernel messages and presence of some
 /sys/devices/system/cpu/.. files ?

Can you use the performance governor instead? That should lock the clock
rate to the maximum supported by the hardware. Something like this at
boot would do the trick:

for CPU in $(awk -F: '$1 ~ /^processor/ {print $2}' /proc/cpuinfo);do
sudo /usr/bin/cpufreq-set -c ${CPU} -g performance
done
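
And to confirm it took effect (assuming the cpufreq sysfs entries exist on
your kernel):

grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor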

Depending on the system, you might be able to do this via a BIOS setting
too, by removing support for OS-set CPU clock rate.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Strange resume statements generated for GRUB2

2013-06-09 Thread Skylar Thompson
On 06/09/2013 11:37 AM, Mikhail Kuzminsky wrote:
 I have swap in sda1 and / in sda2 partitions of HDD. At installation
 of OpenSUSE 12.3 (where YaST2 is used) on my cluster node I found
 erroneous, by my opinion, boot loader (GRUB2) settings.
 
 YaST2 proposed (at installation) to use
 ... resume=/dev/disk/by-id/ata-WDC-... -part1 splash=silent ...
 
 in configuration of GRUB2. This parameters are transmitted (at linux
 loading) by GRUB2 to linux kernel. GRUB2 itself, according my
 installation settings, was installed to MBR. I changed (at installation
 stage) -part1 to -part2, but after that YaST2 restored it back to -
 part1 value !
 And after installation OpenSuSE boots successfully !
 I found (in installed OpenSuSE) 2 GRUB2 configuration files w/erroneous
 -part1 setting.
 
 I found possible interpretation of this behaviour in /var/log/messages.
 I found in this file the strings:
 [Kernel] PM: Checking hibernation image partition
 /dev/disk/by-id/ata-WDC_...-part1
 
 [Kernel] PM: Hibernation Image partition 8:1 present
 [Kernel] PM: Looking for hibernation image.
 {Kernel] PM: Image not found (code -22)
 [Kernel] PM: Hibernation Image partitions not present or could not be loaded
 
 What does it means ? The hibernation image is writing to swap partition
 ?  But I beleive that hibernation is really suppressed in my Linux
 (cpufreq kernel modules are not loaded) , and my BIOS settings do not
 allow any changes of CPU frequency. BTW, my swap partition is small (4
 GB, but RAM size is  8 GB).
 
 Which GRUB2/resume settings are really right and why they are right ?

Hibernation isn't strictly suspension - it's writing all allocated,
non-file-backed portions of memory to the paging/swap space. When the
system comes out of hibernation, it boots normally and then looks for a
hibernation image in the paging space. If it finds one, it loads that
back into system memory rather than proceeding with a regular boot. This
is in contrast to system suspension, which depends on hardware support
to place CPU, memory, and other system devices into a low power state,
and wait for a signal to power things back up, bypassing the boot process.

I'm not a SuSE expert so I'm not sure what YaST is doing, but I imagine
you have to make grub changes via YaST rather than editing the grub
configs directly.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] El Reg: AMD reveals potent parallel processing breakthrough

2013-05-12 Thread Skylar Thompson
On 05/12/2013 07:07 AM, Lux, Jim (337C) wrote:
 I think that if we want people to design and fix automobile and jet
 engines, it is a wise thing to start them with lawnmower and moped engines
 first, rather than have their first hands on experience be with a
 hypersonic SCRAMjet burning hydrazine and FOOF.

 Or in this case, if you want students to learn about network topologies,
 fault tolerance, etc.: I'd rather they do it on something that fits on a
 desktop, is tangible (I can pull a cable and cause a link failure) than
 try to turn them loose managing the internet.



 If they make a mistake and screw up that Arduino, it's cheap to fix or
 replace. It gets reflashed every time you load a new program. You may
 have the best PC support organization in the world, but reloading
 someone's boot drive, or managing thin clients with net boot, is going to
 be more timeconsuming and expensive.


Hear, hear. I know a lot of people (me included) that learn better 
through failure than success. It's not until a pre-med student sees a 
dead person carved open that she really understands anatomy, and it's 
not until us HPC folks see a network of computers performing poorly or 
failing unexpectedly that we really understand all the dependencies 
between the parts.

Our job as seasoned veterans should be finding ways for beginners to 
fail cheaply (both in terms of initial impact and recovery), so that the 
failure can be a good learning experience.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] why we need cheap, open learning clusters

2013-05-12 Thread Skylar Thompson
On 05/12/2013 10:55 AM, Lux, Jim (337C) wrote:

 This is why I think things like ArduWulf or, more particularly 
 LittleFE, are valuable.  And it's also why nobody should start 
 packaging LittleFE clusters in an enclosure. Once all those mobos are 
 in a box with walls, it starts to discourage random and rapid 
 experimentation. If you put a littleFE in a sealed box with an 
 inventory tag and a breaking this seal voids warranty and the only 
 interface is the network jack or keyboard/monitor,  you might as well 
 put a modern multicore mobo in there and spin up VM instances.  In 
 this case, it's the very assembled in a garage kind of look that 
 prompts the willingness of someone to go in and make some unauthorized 
 changes, from which comes learning.

This is heartening to hear. As an aside, you might be interested to know 
that the original LittleFe actually wasn't an open-frame chassis. We hit 
upon the open-frame idea originally to reduce weight to the point where 
we weren't paying surcharges for overweight checked baggage. It so 
happens, though, that it's also an excellent educational tool, and 
engages non-CS people far more than the clusters that are behind thick 
glass windows or, even worse, aren't visible to the public at all.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] El Reg: AMD reveals potent parallel processing breakthrough

2013-05-11 Thread Skylar Thompson
On 05/11/2013 10:39 AM, Lux, Jim (337C) wrote:


 Hard to beat $19/node plus the cost of some wire and maybe a USB hub to
 talk to them all. http://www.pjrc.com/store/teensy.html
 rPi is in the same price range



 So, for, say, $200-300, you could give a student a platform with 8-10
 nodes made from off the shelf widgets that they could do work on.  At that
 price, you're in expensive textbook territory, and the student might be
 able to afford it.

 A class of 30 would only be $10k, which is down in the discretionary
 budget territory.

 You could write a library that provides MPI-like or sockets-like
 interfaces, as well.


 I don't know that you could get there with any sort of standard PC based
 scheme. I've been getting some Atom based mobos for about $90 each
 recently, but you still need to add a power supply. You'd probably boot
 off the net so you don't need a disk drive.

 And then there's the physical size issue.  Put together a cluster of 8
 mini-itx mobos and you're looking at a fairly large pile of hardware. You
 would, of course, be able to run vanilla Linux on them.  If you're using
 off the shelf stuff (I.e. Not making a 8 way ATX power supply), it's
 probably $100/node by the time you're done, so it's now a $800-1000 cost.

 That's high enough to be above the it might be fun to try threshold.

 It kind of depends on the pedagogical objectives..


Not to toot my own horn, but that sounds like LittleFe 
(http://littlefe.net). :)

Despite the low clock speed of Atom CPUs, we've had a lot of success 
using LittleFe in education - often an entire class will share one 
cluster (there are now dozens of units around the country), and the 
curricula written around them are submitted back to the community via 
CSERD. Like you said in a previous message, there's a lot to be said 
for turning computational science and HPC into more of a wet-lab 
experience.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Clustering VPS servers

2013-03-24 Thread Skylar Thompson
On 3/24/2013 10:25 AM, Geoffrey Jacobs wrote:
 On 03/24/2013 01:56 AM, Jonathan Aquilina wrote:
 What I am not understanding is the difference between using a monolithic
 style kernel with everything compiled in vs. modules. Is there a lower
 memory footprint if modules are used.
 Yes, if extraneous drivers are not loaded. You still need some resources
 to handle the initrd on bootup, but that shouldn't be a problem when
 everything is started.

 Something to read:
 http://unix.stackexchange.com/questions/65481/disadvantages-of-linux-kernel-module


IIRC, the initrd memory itself is freed after pivot_root (or 
switch_root) happens, so you only pay that overhead at bootup, not while 
the system is running. Of course, the initrd gives you plenty of other 
advantages besides driver flexibility. Diskless booting, for instance, 
would be much more complex without an initrd to give you a 
fully-functional (if you squint right, at least) UNIX environment before 
the main init process starts.
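
If you've never poked around inside one, the whole handoff is 
surprisingly small; here's a minimal sketch of an initramfs /init 
(busybox-style, with made-up device names - a real distro initrd does 
a lot more):

  #!/bin/sh
  # Minimal initramfs /init sketch: mount the pseudo-filesystems, mount
  # the real root, then hand control over. Device names are illustrative.
  mount -t proc proc /proc
  mount -t sysfs sysfs /sys
  mount -t devtmpfs devtmpfs /dev

  # For a diskless node this could just as easily be an NFS or NBD mount.
  mkdir -p /newroot
  mount -o ro /dev/sda1 /newroot

  # Hand everything over to the real init; the initramfs contents are
  # freed once the switch completes.
  exec switch_root /newroot /sbin/init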

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Configuration management tools/strategy

2013-01-06 Thread Skylar Thompson
On 01/06/2013 05:38 AM, Walid wrote:
 Dear All,
 
 At work we are starting to evaluate Configuration management to be
 used to manage several diverse hpc clusters, and their diverse node
 types. I wanted to see what are other admins, and HPC users
 experience like,  the ones that we will start evaluating are
 CFEngine3, Puppet, Chef, Saltstack, ansible, and blueprint. there
 might other products that we need to evluate in partnership such as
 Foreman, spacewalk, ..etc.
 
 I would like to hear from you if you did evaluate such tools, or
 using one, or have a different strategy in keeping and maintaining
 configurations.

CFengine probably isn't a bad choice - going with something that's
well-tested and -used is helpful because it's a lot easier to get
recipes for what you need to do. The one on the list I can absolutely
recommend against is Spacewalk - we use RHN Satellite (the commercial
version of Spacewalk) and it is easily the worst configuration
management system I have ever seen.

<rant>
Here's some problems:

1. It's slow - everything you do has to go through Jabber, Tomcat,
Oracle, and god knows what else. Trying to schedule an action on
hundreds of systems can take minutes to accomplish, degrading
performance for everyone else.
2. It's unreliable - it can take multiple attempts to actually get a
command scheduled.
3. Its configuration file management is byzantine - rather than
letting you combine fragments of configuration files, you have to
include whole files in channels, which are then applied in some order
to each individual system. Good luck trying to figure out which
systems have which order after the fact, or trying to update the
order in some predictable fashion.
4. Its channel management is slow and opaque - if you upload a new
package into Satellite, it has to rebuild its indices before it's
available to client systems. When this starts and when this finishes
is totally invisible to you, though, so you have no idea when the
package is actually ready to be installed.
5. Support is atrocious - Even with a paid support contract, we've had
pretty bad experiences. We've encountered serious bugs and
deficiencies in Satellite that have taken years to correct.

I could go on but I think that's sufficient.

</rant>
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Maker2 genomic software license experience?

2012-11-08 Thread Skylar Thompson
On 11/08/12 02:35, Tim Cutts wrote:
 On 8 Nov 2012, at 10:10, Andrew Holway andrew.hol...@gmail.com wrote:

   
 It's all a bit academic now (ahem) as the MPI component is a Perl
 program, and Perl isn't supported on BlueGene/Q. :-(

 huh? perl mpi?

 Interpreted language? High performance message passing interface?

 confused.
 
 Welcome to the wonderful world of bioinformatics and genomics high 
 performance computing.  Didn't  you know that perl, python, ruby and java 
 are all much faster than C and FORTRAN?  Apparently it's a well-known fact, 
 and what would I, a mere system administrator, know otherwise?

 Sarcasm mode off now.

 Tim 

   
I guess if your development time is sufficiently shorter than it would
be for the equivalent compiled code, it could make sense. In Genome
Sciences here at the University of Washington, the grad students are
taught Python and R, and there are a number of people who love the
Python MPI bindings. We also have some C MPI users, but it's not as
popular as Python.

I suppose what you can say is that, for the right application, Python
MPI certainly is faster than serial Python.

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Maker2 genomic software license experience?

2012-11-08 Thread Skylar Thompson
On 11/08/2012 06:10 AM, Tim Cutts wrote:
 
 On 8 Nov 2012, at 13:52, Skylar Thompson
 skylar.thomp...@gmail.com wrote:
 
 I guess if your development time is sufficiently shorter than
 the equivalent compiled code, it could make sense.
 
 This is true, and a lot of what these guys are writing is pipeline
 glue joining other bits of software together, for which scripting
 languages are perfect.  But there is an element of the "to the man
 with a hammer everything looks like a nail" thing going on, and
 people are writing analysis algorithms in these languages too.
 That's fine for prototyping, but once you run it in production and
 it's going to use thousands of CPU-years, it might be nice if
 occasionally the prototypes were replaced with something that could
 run in hundreds of CPU years instead.  In those cases, investing a
 few extra weeks in implementing in a harder language is
 cost-effective.
 
 In Genome Sciences here at University of Washington, the grad
 students are taught Python and R, and there's a number of people
 who love the Python MPI bindings. We also have some C MPI users,
 but it's not as popular as Python.
 
 I supposed what you can say is, for the right application, Python
 MPI certainly is faster than serial Python.
 
 Maybe, maybe not.  If the problem is embarrassingly parallel, which
 many genomics problems are, often not.  We never adopted MPI-BLAST
 at Sanger, taking an old example, because the throughput was always
 far greater running multiple independent serial BLAST jobs, at
 least in a mixed environment where the BLAST searches weren't
 terribly predictable.
 
 Plus of course, writing that MPI version of the code is much harder
 to get right than the serial version, so it goes against the
 original argument for keeping the development time short.
 
 I realise I'm playing devil's advocate here, to a great extent.
 But most genomics that I've dealt with so far is really about high
 throughput, not about short turnaround time of a single analysis
 job.  Of course there are some exceptions, and I'm making far too
 many sweeping generalisations here.
 
 Tim
 

This is definitely true. Many of the MPI jobs here are not what many
Beowulfers think of as traditional parallel jobs - they aren't tightly
coupled; instead, there's one master rank that farms data-parallel work
out to the child ranks and then does some post-processing when
everything is finished. Work like that could easily be written as a gang
of serial jobs and get the same speedup (or lack of speedup - a
perennial challenge is explaining how slow disks really are).
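
For the Grid Engine flavor of that, the gang-of-serial-jobs version is
usually just an array job; a sketch (script name, paths, and task count
are made up):

  # One array job, 1000 independent tasks; each task works on its own
  # slice of the input, keyed off $SGE_TASK_ID.
  qsub -t 1-1000 -cwd -o logs/ -e logs/ process_chunk.sh

  # ...and inside process_chunk.sh, something like:
  #   INPUT=chunks/chunk_${SGE_TASK_ID}.dat
  #   my_analysis "$INPUT" > results/result_${SGE_TASK_ID}.out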

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Torrents for HPC

2012-06-12 Thread Skylar Thompson
On 06/12/2012 03:42 PM, Bill Broadley wrote:
 Using MPI does make quite a bit of sense for clusters with high
 speed interconnects.  Although I suspect that being network bound
 for IO is less of a problem.  I'd consider it though, I do have
 sdr/ddr/qdr clusters around, but so far (knock on wood) not IO
 limited.  I've done a fair bit of MPI programming, but I'm not sure
 it's easy/possible to have nodes dynamically join/leave.  Worst
 case I guess you could launch a thread/process for each pair of
 peers that wanted to trade blocks and still use TCP for swapping
 metadata about what peers to connect to and block to trade.

We manage this by having users run the transfer in the same Grid Engine
parallel environment their job runs in. This means they're guaranteed
to run the sync job on the same nodes as their actual job. The copied
files change so slowly that even a 1GbE network is rarely a bottleneck,
since we only transfer files that have changed.
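
Concretely, the sync step is just the front of the job script; a rough
sketch (the PE name, slot count, and paths are invented, and it assumes
passwordless rsync/ssh between nodes):

  #!/bin/sh
  #$ -pe mpi 64
  #$ -cwd
  # Push changed files to every node in this allocation before the real
  # work starts. $PE_HOSTFILE has one "host slots queue processor" line
  # per node in the allocation.
  while read host slots rest; do
      rsync -a --update /shared/dataset/ "${host}:/local/scratch/dataset/" &
  done < "$PE_HOSTFILE"
  wait
  # ...then launch the actual work on the same set of nodes.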

Skylar
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] rear door heat exchangers

2012-02-01 Thread Skylar Thompson
On 2/1/2012 5:20 AM, Michael Di Domenico wrote:
 On Tue, Jan 31, 2012 at 5:23 PM,  hol...@th.physik.uni-frankfurt.de wrote:
 Hi,

 We have installed a lot of racks with rear door heat exchangers but these
 are without fans instead using the in-server fans to push the air through
 the element. We are doing this with ~20kW per rack.

 How the hell are you drinking 35kW in a rack?
 
 start working with GPU's...  you'll find out real fast...

You don't even necessarily need GPUs - our latest blade chassis suck
up 7500W in 7U going at full bore. It's pretty unpleasant standing
behind them, though.

-- 
-- Skylar Thompson (skylar.thomp...@gmail.com)
-- http://www.cs.earlham.edu/~skylar/
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Users abusing screen

2011-10-27 Thread Skylar Thompson

On 10/27/2011 04:37 PM, Mark Hahn wrote:
 nice 'pam_slurm' module which allows a user to login only to those nodes
 on which the said user has active jobs (allocated through slurm). The

 I think this is slightly BOFHish, too. do people actually have problems
 with users stealing cycles this way? the issue is actually stealing,
 and we simply tell our users not to steal. (actually, I don't think we
 even point it out, since it's so obvious!)

 that means we don't attempt to control (we had pam_slurm installed and
 actually removed it.) after all, just because a user's job is done, it
 doesn't mean the user has no reason to go onto that node (maybe there's a
 status file in /tmp, or a core dump or something.)

 if someone persisted in stealing cycles, we'd lock their account.


We do the equivalent with GE if the end user requests it. We have
some clusters that need to support a mix of critical jobs supporting
data pipelines and less-critical academic work. Our default stance,
though, is to trust our users to do the right thing. Mostly it works,
but sometimes we do need to bring out the LART stick.

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Users abusing screen

2011-10-22 Thread Skylar Thompson
On 10/22/11 05:02, Ellis H. Wilson III wrote:

 Insane?  I mean, I do a lot of work on a bunch of different distros and
 hardware types, and have found little use for screen /unless/ I was on a
 really, really poor internet connection that cut out on the minutes
 level.  Can you give some examples regarding something you can do with
 screen you cannot do with nohup and tail?

   

Here's a few I can think of:

* Multiple shells off one login
* Scroll buffer
* Copy/paste w/o needing a mouse
* Start session logging at any time, w/o needing to remember to use
script or nohup

I guess I'm with Andrew, where the first thing I do upon logging in is
either connecting to an existing screen session or starting a fresh one.
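
My login habit boils down to something like this (session names are
arbitrary):

  screen -S work            # start a named session
  screen -ls                # list existing sessions
  screen -dRR -S work       # reattach to "work" wherever it is, or create it
  # C-a d detaches, C-a [ enters scrollback/copy mode, C-a H toggles logging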

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] cluster scheduler for dynamic tree-structured jobs?

2010-05-15 Thread Skylar Thompson

On 05/15/10 03:24, Andrew Piskorski wrote:

Folks, I could use some advice on which cluster job scheduler (batch
queuing system) would be most appropriate for my particular needs.
I've looked through docs for SGE, Slurm, etc., but without first-hand
experience with each one it's not at all clear to me which I should
choose...

I've used Sun Grid Engine for this in the past, but the result was
very klunky and hard to maintain.  SGE seems to have all the necessary
features underneath, but no good programming API, and its command-line
tools often behave in ways that make them a poor substitute.

Here's my current list of needs/wants, starting with the ones that
probably make my use case more unusual:

1. I have lots of embarrassingly parallel tree-structured jobs which I
dynamically generate and submit from top-level user code (which
happens to be written in R).  E.g., my user code generates 10 or 100
or 1000 jobs, and each of those jobs might itself generate N jobs.
Any given job cannot complete until all its children complete.

Also, multiple users may be submitting unrelated jobs at the same
time, some of their jobs should have higher priority than others, etc.
(The usual reasons for wanting to use a cluster scheduler in the first
place, I think.)

Thus, merely assigning the individual jobs to compute nodes is not
enough, I need the cluster scheduler to also understand the tree
relationships between the jobs.  Without that, it'd be too easy to get
into a live-lock situation, where all the nodes are tied up with jobs,
none of which can complete because they are waiting for child jobs
which cannot be scheduled.
   


I'm not quite sure I understand what you're doing, but if you make all 
your execution hosts submit hosts as well, you can submit jobs from 
within your running jobs. Using -now y -sync y on those submissions 
ensures that the parent doesn't exit until its children have exited.
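
A rough sketch of what that looks like from inside a parent job (script
name and fan-out are made up):

  #!/bin/sh
  # Parent job: spawn child jobs and block until they all finish.
  # Assumes the execution hosts are also configured as submit hosts.
  for i in $(seq 1 10); do
      qsub -now y -sync y -cwd child_task.sh "$i" &
  done
  # Each backgrounded qsub only returns once its child job has exited, so
  # the wait keeps this parent alive until the whole subtree is done.
  wait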



2. Sometimes I can statically figure out the full tree structure of my
jobs ahead of time, but other times I can't or won't, so I definitely
need a scheduler that lets me submit new sub-jobs on the fly, from any
node in the cluster.

3. The jobs are ultimately all submitted by a small group of people
who talk to each other, so I don't really care about any fancy
security, cost accounting, grid support, or other such features
aimed at large and/or loosely coupled organizations.

4. I really, really want a good API for programmably interacting with
the cluster scheduler and ALL of its features.  I don't care too much
what language the API is in as long as it's reasonably sane and I can
readily write glue code to interface it to my language of choice.
   


I haven't looked at it much, but I think DRMAA will work for that in SGE.


5. Although I don't currently do any MPI programming, I would very
much like the option to do so in the future, and integrate it smoothly
with the cluster scheduler.  I assume pretty much all cluster
schedulers have that, though.  (Erlang integration might also be nice.)
   


SGE does indeed do MPI integration. I doubt it does Erlang integration 
out of the box but the integration is just a collection of pre- and 
post-job scripts so you should be able to write it yourself if you have to.



6. Each of my individual leaf-node jobs will typically take c. 3 to 30
minutes to complete, so my use shouldn't stress the scheduler's own
performance too much.  However, sometimes I screw that up and submit
tons of jobs that each want to run for only a small amount of time,
say 2 minutes or less, so it'd be nice if the scheduler is
sufficiently efficient and low-latency to keep up with that.
   


SGE's scheduler latency is tunable to a certain degree. As you decrease 
the maximum latency, you increase the load, so you might need beefier 
hardware to accommodate it.
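
The knob I'm thinking of lives in the scheduler configuration; for
example (the interval shown is arbitrary):

  qconf -ssconf | grep schedule_interval   # view the current setting
  # qconf -msconf opens the scheduler configuration in $EDITOR; the
  # relevant line looks like:
  #   schedule_interval   0:0:15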



7. When I submit a job, I should be able to easily (and optionally)
give the scheduler my estimates of how much RAM and cpu time the job
will need.  The scheduler should track what resources the job ACTUALLY
uses, and make it easy for me to monitor job status for both running
and completed jobs, and then use that information to improve my
resource estimates for future jobs.  (AKA good APIs, yet again.)
   


SGE can give you this with requestable complexes, although I don't think 
it'll learn from your estimates.
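
Roughly speaking (h_vmem and h_rt are the stock complexes, but whether
they're requestable or consumable depends on your site's configuration):

  # Ask for 4GB of memory and 2 hours of runtime up front...
  qsub -l h_vmem=4G,h_rt=2:00:00 -cwd analysis.sh

  # ...then compare against what the job actually used once it finishes.
  qacct -j <jobid> | egrep 'maxvmem|ru_wallclock|cpu'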



8. Of course the scheduler must have a good way to track all the basic
information about my nodes:  CPU sockets and cores, RAM, etc.  Ideally
it'd also be straightforward for me to extend the database of node
properties as I see fit.  Bonus points if it uses a good database
(e.g. SQLite, PostgreSQL) and a reasonable data model for that stuff.

Thanks in advance for your help and advice!
   



SGE does this and can make it available as XML.


--
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf

Re: [Beowulf] cluster scheduler for dynamic tree-structured jobs?

2010-05-15 Thread Skylar Thompson

On 05/15/10 08:44, Andrew Piskorski wrote:

SGE does this and can make it available as XML.
 

Which reminds me, I need to look harder to figure out WHERE exactly
SGE stores its node configuration data, and how I can perhaps extend
it with additional information, like the network topology between my
nodes.  This is probably simple but it wasn't obvious from the
(voluminous) SGE docs.

   



I think it depends on whether you're using text or BDB as your backend. 
If you're using text, it'll be in $SGE_ROOT/$SGE_CELL, with 
node-specific customizations in $SGE_ROOT/$SGE_CELL/local_conf. I'm not 
sure about BDB though.
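
Whichever backend you use, qconf is the supported way to inspect and
extend the per-host data; a sketch of how I'd attach topology
information (the switch complex and its values are invented):

  # Show an execution host's current configuration and complex values.
  qconf -se node001

  # Add a custom complex (opens the complex table in $EDITOR), e.g.:
  #   switch   sw   RESTRING   ==   YES   NO   NONE   0
  qconf -mc

  # Tag a node with its switch so jobs can request topology, e.g.
  # qsub -l switch=rack3-sw1 ...
  qconf -mattr exechost complex_values switch=rack3-sw1 node001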



--
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] which 24 port unmanaged GigE switch?

2010-04-05 Thread Skylar Thompson
On 4/5/2010 1:27 PM, Michael Di Domenico wrote:
 A couple small 10node clusters we have setup used to routinely drop
 off the network and the switch would have to be hard reset for it to
 return.  Granted we didn't do any deep analysis (just replaced with
 cisco) and it could be attributed to some bad switches, but i've also
 seen this at home with some 1gb switches i bought.

 over the years i've been using netgear enterprise and home products,
 they are wonderful in light use 80-85% max throughput, but once you
 hit the 90+ areas they seem to start to degrade either through packet
 loss or over heating

 we still buy them for our management network, they're cheaper then hp
 and we just need it for kickstarts, snmp, etc..


   

This has been my experience too. At my last job we had a pair of managed
Netgear gigabit switches with two GBIC uplinks each, bonded together
with LACP. We burned out all four GBICs just about every year, and
although Netgear was happy to keep replacing them, it was certainly
annoying.

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] are compute nodes always kept in a private I/P and switch space?

2010-01-13 Thread Skylar Thompson
Rahul Nabar wrote:
 I always took it as natural to keep all compute nodes on a private
 switch and assigned them local I/P addresses. This was almost
 axiomatic for an HPC application in my mind. This way I can channel
 all traffic to the world and logins while a select login-node. Then
 firewall the login nodes carefully.

 Just today, though, on a new project the  admin said he always keeps
 his compute nodes with public I/Ps and runs individual firewalls on
 them.

 This seemed just so wrong to me in so many ways but i was curious if
 there are legitimate reasons why people might do this? Just curious.

   
I do everything I can to keep cluster nodes on a private network, with
only the head node visible on the public network. One exception I've had
to make is when storage is on a separate network: NAT doesn't play well
with CIFS/NFS, so it's just easier to give the nodes fully-routable IP
addresses.

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Need some advise: Sun storage' management server hangs repeatedly

2010-01-13 Thread Skylar Thompson
Sangamesh B wrote:
 Hi HPC experts,

  I seek your advise/suggestion to resolve a storage(NAS) server'
 repeated hanging problem.

  We've a 23 nodes Rocks-5.1 HPC cluster. The Sun storage of
 capacity 12 TB is connected to a management server Sun Fire X4150
 installed with RHEL 5.3 and this server is connected to a Gigabit
 switch which provides cluster private network. The home directories on
 the cluster are NFS mounted from storage partitions across all nodes
 including the master.

This server gets hanged repeatedly. As an initial troubleshooting
 we installed Ganglia, to check network utilization. But its normal.
 We're not getting how to troubleshoot it and resolve the problem. Can
 anybode help us resolve this issue?
Is there anything amiss according to the service processor?

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] PERC 5/E problems

2009-12-31 Thread Skylar Thompson
Vlad Manea wrote:
 Hi,

 I have a PERC 5/E card installed on my frontend (Dell PE 2970) that
 will be
 used to connect a MD1000 from Dell.
 I have a problem: PERC 5/E is not showing in BIOS.  When the server is
 starting up, I cannot
 press Ctrl-R to launch the PowerEdge Expandable RAID Controller BIOS
 (the card does not display it's bios boot message saying hit ctrl-r
 for perc 5...).

 I tried a different PCI slot and riser but with no luck.

 Is out there anybody that might give a hand fixing this?

I can't remember if the Dell BIOS has this option, but some BIOSes allow
you to clear the PCI bus cache. That will trigger a full rescan of all
the attached cards and could get the controller listed in the boot
process again. If the BIOS doesn't have that option, you could try
setting the BIOS clear jumper.

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] PERC 5/E problems

2009-12-31 Thread Skylar Thompson
Vlad Manea wrote:
 Thanks all for your replays,

 In the end I think I found the problem: It looks like I have the PERC
 model M778G which apparently
 does NOT do RAID (maybe some of you can confirm that :-) ). I was
 thinking (wrongly maybe...) that
 all PERC cards do RAID...
I can't speak to that card specifically, but Dell in the past did sneaky
things like calling a system RAID-capable when, in order to make it
actually do RAID, you had to buy a hardware key or daughter card at
some inflated price.

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?

2009-10-03 Thread Skylar Thompson
Rahul Nabar wrote:
 True. That's a useful feature. But that could be done by sending
 magic packets to a eth card as well, right? I say can because I
 don't have that running on all my servers but had toyed with that on
 some. I guess, just many ways of doing the same thing.
   

You could use Wake-on-LAN to turn a system on, but I don't think you can
reset or power off the system with it. Whatever you use should give you
some authentication/authorization, and hopefully encryption, so that you
don't have just anyone rebooting systems. IPMI 1.5+ will do this, but
Wake-on-LAN does not.

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?

2009-10-03 Thread Skylar Thompson
Rahul Nabar wrote:
 On Fri, Oct 2, 2009 at 10:13 PM, Skylar Thompson sky...@cs.earlham.edu 
 wrote:

   
 THanks Joe! I've been reading about IPMI and also talking to my vendor
 about it. Sure, our machines also have IPMI support!

 Question: What's the difference between SOL and IPMI. Is one a subset
 of the other?

   
 SOL is provided by IPMI v1.5+, so it's a part of IPMI itself.

 

 Last two days I was playing with ipmitool to connect to the machines.
 Is this the typical tool or do people have any other open source
 sugesstions to use.

 ipmitool seems to have a rich set of features for querying the BMC's
 etc. but I didn't see many SOL sections. Is ipmitool what people use
 to watch a redirected console as well or anything else? I couldn't
 find any good SOL howtos or tutorials on google other than the
 vendor-specific ones. ANy pointers?

 I did find this one from a fellow Beowulfer but this seems quite dated 
 now.

 http://buttersideup.com/docs/howto/IPMI_on_Debian.html

   
That's what we use. Something like ipmitool -a -H hostname -U username
-I lanplus sol activate will do the trick.
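
A few other handy invocations with the same tool (host and username are
placeholders; -a prompts for the password):

  # Serial-over-LAN console; '~.' drops out of an active SOL session
  ipmitool -a -H node01-bmc -U admin -I lanplus sol activate
  ipmitool -a -H node01-bmc -U admin -I lanplus sol deactivate

  # Remote power control and sensor readings from the same BMC
  ipmitool -a -H node01-bmc -U admin -I lanplus chassis power status
  ipmitool -a -H node01-bmc -U admin -I lanplus chassis power cycle
  ipmitool -a -H node01-bmc -U admin -I lanplus sensor list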

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?

2009-10-03 Thread Skylar Thompson
Rahul Nabar wrote:
 I see, thanks for disabusing me of my notion of ipmi as one
 monolithic all-or-none creature. From what you write (and my online
 reading) it seems there are several discrete parts:

 IMPI 2.0
 switched remotely accessible PDUs
 serial concentrator type system 
   

These actually are different beasts. IPMI you'll find on a host
motherboard. Switched PDUs tend to provide a telnet/ssh/web interface,
but you should also make sure you can switch outlets using SNMP to make
scripting easier. The serial concentrator is an appliance that has a
bunch of serial ports that you can connect to serial ports on your
systems. You'll ssh into the concentrator and be able to select a port
to connect to. These are really nice for switches, so you can make
disruptive changes without worrying about the network change cutting off
your telnet or ssh session to the switch.
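
To give a flavor of the scripting side (the OID and the integer values
are placeholders - every vendor's MIB names the outlet-control objects
differently, so pull the real ones from your PDU's documentation):

  # Hypothetical outlet-control OID for outlet 7 - substitute your PDU's.
  OUTLET_OID='.1.3.6.1.4.1.XXXXX.1.2.3.7'
  snmpget -v2c -c public  pdu-a01 "$OUTLET_OID"        # read current state
  snmpset -v2c -c private pdu-a01 "$OUTLET_OID" i 3    # e.g. 3 = reboot on some PDUs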

 Correct me if I am wrong but these are all options and varying
 vendors and implementations  will offer parts or all or none of these?
 Or is it that when one says IPMI 2 it includes all these features. I
 did read online but these implementation seem vendor specific so its
 hard to translate jargon across vendors. e.g. for Dell they are called
 DRAC's etc.
   

I think IPMI defines the way different components talk to each other,
but it doesn't mandate that a given implementation use all the
components in the specification. There are just mandates for how it'll
authenticate, talk to sensors, connect to the serial port, etc. if it
chooses to provide those features.


-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?

2009-10-03 Thread Skylar Thompson
Rahul Nabar wrote:
 Thanks Skylar. I just found I have bigger problems. I thought I was
 done since ipmitool did a happy make; make install.

 But nope:

 ./src/ipmitool -I open chassis status
 Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0:
 No such file or directory
 Error sending Chassis Status command


 I don't think I have the impi devices visible. From googling this
 seems a bigger project needing insertion of some kernel modules. There
 goes my weekend! :)
   

Yeah, I've run into that problem too. You do need the IPMI kernel
modules loaded if you're connecting locally rather than over the LAN.
Here are the modules I see loaded on one of my RHEL5 Dell systems:
ipmi_devintf   44753  0
ipmi_si77453  0
ipmi_msghandler72985  2 ipmi_devintf,ipmi_si
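
Loading them by hand and retrying is usually enough to tell whether the
in-band interface works at all; roughly:

  # Load the local interface drivers, then retry the in-band query.
  modprobe ipmi_msghandler
  modprobe ipmi_si
  modprobe ipmi_devintf
  ls -l /dev/ipmi0 /dev/ipmi/0 /dev/ipmidev/0 2>/dev/null
  ipmitool -I open chassis status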

If you can't get the IPMI devices working even after loading those
modules, you might try looking at configuring your system's IPMI network
interface manually. You should be able to do this during the boot
process on any system (look for a device called Service Processor or
Baseboard Management Controller after POST and before the OS boots).
Some systems also have their own non-IPMI ways of configuring IPMI. If
you're on Dell you can use OpenManage's omconfig command-line tool.
Older x86 Sun systems like the v40z and v20z would let you key in the
network information from the front panel, while newer Sun systems let
you connect over a serial port to configure it.

-- 
-- Skylar Thompson (sky...@cs.earlham.edu)
-- http://www.cs.earlham.edu/~skylar/




___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf

