Re: [Beowulf] And wearing another hat ...

2023-11-13 Thread Christopher Samuel

On 11/13/23 09:06, Joshua Mora wrote:


Some folks trying to legally bypass government restrictions.


I'm afraid that seems to be a parody/hoax/performance art thing:

https://www.vice.com/en/article/88xk7b/del-complex-ai-training-barge

> But there’s one glaring issue: Del Complex is not a real AI company,
> and its barge is similarly fake.
>
> The first tip-off is that Del Complex describes itself as an
> “alternate reality corporation.”


All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] naming clusters

2023-03-23 Thread Christopher Samuel

On 3/23/23 3:12 pm, Prentice Bisbal via Beowulf wrote:

honestly is there any better task for a system admin than coming up with 
good hostnames?


I remember at $JOB-2 the first of our HPC systems had been all racked up,
but it wasn't until I was in the datacentre, in the CentOS 5 installer
saying "OK, now I *really* need a hostname for this machine", that we
finally agreed what to call it.


...and that's how Bruce was named.

(and no, not a Monty Python reference, but a homage to a mentor of my 
then boss who got him into HPC and who had recently died)


--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Checkpointing MPI applications

2023-03-23 Thread Christopher Samuel

On 2/19/23 10:26 am, Scott Atchley wrote:


Hi Chris,


Hi Scott!

It looks like it tries to checkpoint application state without 
checkpointing the application or its libraries (including MPI). I am 
curious if the checkpoint sizes are similar to or significantly larger
than the application's typical outputs/checkpoints. If they are much larger, 
the time to write will be higher and they will stress capacity more.


Hmm, I'm not sure (my involvement is relatively peripheral) but I think 
we want to see this used with apps that have no existing C/R mechanism. 
If you ping me directly I can point you to people who will know more 
than I on this.


We are looking at SCR for Frontier with the idea that users can store 
checkpoints on the node-local drives with replication to a buddy node. 
SCR will manage migrating non-defensive checkpoints to Lustre.


Interesting, does it really need local storage or can it be used with 
diskless systems via tricks with loopback filesystems, etc?
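
(For instance - and this is purely an illustrative sketch, with all paths
and sizes invented - something along these lines to fake up "node-local"
storage out of a file:)

# create a file-backed ext4 filesystem and mount it where the tool
# expects node-local storage; here the backing file lives in a tmpfs
dd if=/dev/zero of=/dev/shm/scr-cache.img bs=1M count=4096
mkfs.ext4 -F -q /dev/shm/scr-cache.img
mkdir -p /tmp/scr-cache
mount -o loop /dev/shm/scr-cache.img /tmp/scr-cache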


All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [External] Checkpointing MPI applications

2023-03-23 Thread Christopher Samuel

Hi Prentice,

On 2/20/23 7:46 am, Prentice Bisbal via Beowulf wrote:

Is anyone working on DMTCP or MANA going to start monitoring the 
dmtcp-forum mailing list?


Sorry I didn't get a chance to circle back here! I did raise this with
them and they promised to reach out to you. Hopefully they'll also keep
an eye on the lists, but the folks on that side are academics and are
pretty pressed for time.


All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [External] Re: old sm/sgi bios

2023-03-23 Thread Christopher Samuel

On 3/23/23 11:59 am, Fischer, Jeremy wrote:


HPUX 9. Hands down.


v9? Luxury!

For my sins, in the mid 90s I was part of the small team that managed a
heterogeneous UNIX network for folks doing portable compiler development;
I think we had ~12 UNIX variants on ~8 hardware platforms (conservatively).


Worst was a tie between SCO UNIX on x86 which had yet to discover 
symbolic links and Idris on Transputer where the shutdown command would 
crash part way through so then we had to do "sync; sync; sync" and then 
press the power button.


Most amusing problem was the Ultrix/AIX NFS war where one would cause 
filesystem corruption on the other (can't remember which was the culprit 
and which was the victim). Also SGI boxes don't like sucking in cigar 
smoke and tar.


ObHPC: Transputer counted as HPC at one time, right?

All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


[Beowulf] Checkpointing MPI applications

2023-02-18 Thread Christopher Samuel

Hi all,

The list has been very quiet recently, so as I just posted something to 
the Slurm list in reply to the topic of checkpointing MPI applications I 
thought it might interest a few of you here (apologies if you've already 
seen it there).


If you're looking to try checkpointing MPI applications you may want to 
experiment with the MANA ("MPI-Agnostic, Network-Agnostic MPI") plugin 
for the DMTCP C/R effort here:


https://github.com/mpickpt/mana

We (NERSC) are collaborating with the developers and it is installed on 
Cori (our older Cray system) for people to experiment with. The 
documentation for it may be useful to others who'd like to try it out - 
it's got a nice description of how it works too which even I, as a 
non-programmer, can understand.


https://docs.nersc.gov/development/checkpoint-restart/mana/

Pay special attention to the caveats in our docs though!
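
To give a flavour of the workflow, plain DMTCP looks roughly like this
(MANA wraps the same flow in its own launch/restart scripts - check the
docs above for the exact commands, this is just a sketch from memory and
"./my_app" is obviously a placeholder):

dmtcp_coordinator --daemon --exit-on-last   # start a coordinator
dmtcp_launch ./my_app                       # run the app under DMTCP
dmtcp_command --checkpoint                  # request a checkpoint
./dmtcp_restart_script.sh                   # later: restart from the checkpoint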

I've not used it myself, though I'm peripherally involved to give advice 
on system related issues.


I'm curious if there are other methods that people are using out there 
for transparent checkpointing of MPI applications?


All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


[Beowulf] OpenMPI over libfabric (was Re: Top 5 reasons why mailing lists are better than Twitter)

2022-11-21 Thread Christopher Samuel

On 11/21/22 4:39 am, Scott Atchley wrote:

We have OpenMPI running on Frontier with libfabric. We are using HPE's 
CXI (Cray eXascale Interface) provider instead of RoCE though.


Yeah I'm curious to know if Matt's issues are about OpenMPI->libfabric 
or libfabric->RoCE ?


FWIW we're using Cray's MPICH over libfabric (also over CXI), the ABI 
portability of MPICH is really useful to us as it allows us to patch 
containers used via Shifter to replace their MPI libraries with the Cray 
ones and have their code use the HSN natively.
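
Conceptually it's just swapping the MPI the container sees for the
host's one at run time - very roughly like the below, though the real
thing is automated in Shifter's configuration rather than done by hand,
and the paths and flags here are made up for illustration:

# bind-mount the host's Cray MPICH into the container and have the
# dynamic linker pick it up ahead of the image's own MPI libraries
shifter --image=myrepo/myapp:latest \
        --volume=/opt/cray/pe/mpich:/opt/host-mpich \
        /bin/bash -c 'LD_LIBRARY_PATH=/opt/host-mpich/lib:$LD_LIBRARY_PATH ./my_mpi_app'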


All the best,
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [External] beowulf hall of fame

2022-02-26 Thread Christopher Samuel

On 2/26/22 5:10 am, H. Vidal, Jr. wrote:


Is Don on the list any more?


I can neither confirm nor deny it. :-)

--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Question about fair share

2022-01-24 Thread Christopher Samuel

On 1/24/22 11:17 am, Tom Harvill wrote:

We use a 'fair share' feature of our scheduler (SLURM) and have our 
decay half-life (the time needed for priority penalty to halve) set to 
30 days.  Our maximum job runtime is 7 days.  I'm wondering what others 
use, please let me know if you can spare a minute.  Thank you!


We use Slurm but we don't use fairshare; instead we have a priority 
threshold which jobs have to age to before they can get a forward 
reservation on nodes (they can of course backfill before then). We 
configure things so that jobs age at ~1 priority point per minute and 
then set our QOS's so that the start time is a certain amount of time 
away from that threshold.


We also set things up so that only 2 jobs per user+account+qos 
association can age, and once one starts running the next in line will 
begin ageing.
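
For the curious, the relevant knobs look something like this - the
values are purely illustrative rather than our production config, and
the accrue limit name is from memory, so check the accounting docs:

# slurm.conf: multifactor priority driven by age only; with
# PriorityMaxAge=14-0 (20160 minutes) and PriorityWeightAge=20160 a
# pending job gains roughly 1 priority point per minute as it ages
PriorityType=priority/multifactor
PriorityWeightAge=20160
PriorityMaxAge=14-0
PriorityWeightFairshare=0

# and a per-association limit on how many jobs may accrue age at once
sacctmgr modify user where name=myuser account=myaccount set MaxJobsAccrue=2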


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] SC21 Beowulf Bash Panels

2021-11-15 Thread Christopher Samuel

On 11/14/21 12:16 pm, Douglas Eadline wrote:


While there is always a bit of Beowulf snark surrounding the Bash
I wanted to mention the technical panels are looking to be
very interesting.


Enjoy! Sadly the timing doesn't work for me this year (time zones and 
family commitments) but I'm sure it'll be awesome!


--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] List archives

2021-08-16 Thread Christopher Samuel

Hi John,

On 8/16/21 12:57 am, John Hearns wrote:


The Beowulf list archives seem to end in July 2021.
I was looking for Doug Eadline's post on limiting AMD power and the 
results on performance.


Hmm, that's odd, I'll take a look tonight, thanks for the heads up!

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [External] RIP CentOS 8

2020-12-08 Thread Christopher Samuel

On 12/8/20 1:06 pm, Prentice Bisbal via Beowulf wrote:

I wouldn't be surprised if this causes Scientific Linux to come back 
into existence.


It sounds like Greg K is already talking about CentOS-NG (via the ACM 
SIGHPC syspro Slack):


https://www.linkedin.com/posts/gmkurtzer_centos-project-shifts-focus-to-centos-stream-activity-6742165208107761664-Ng4C

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [External] Re: Administrivia: Beowulf list moved to new server

2020-11-23 Thread Christopher Samuel

On 11/23/20 10:33 pm, Tony Brian Albers wrote:


What they said:


Thank you all for your kind words on and off list, really appreciated!

My next task is to invite back to the list those who got kicked off when 
our previous hosting lost its reverse DNS records and various sites 
started rejecting email from us.


Enough from me, back to the HPC talk now!

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [External] CentOS 8 with OpenHPC 1.3.9 available on Qlustar

2020-04-17 Thread Christopher Samuel

On 4/17/20 12:17 PM, Prentice Bisbal via Beowulf wrote:

I'm aware. I just meant to correct the announcement which stated "Slurm 
18.10.x", which is a version that never existed.


I know, I was just commenting for Roland. :-)

--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [External] CentOS 8 with OpenHPC 1.3.9 available on Qlustar

2020-04-17 Thread Christopher Samuel

On 4/17/20 10:14 AM, Prentice Bisbal via Beowulf wrote:


I think you mean Slurm 18.08.x


Just a heads up that Slurm 18.08 is no longer supported; 20.02 is the
current release and 19.05 is now only getting security fixes, from what
I've read on the Slurm list (though some fixes have gone into their git
repo, some of which I've cherry-picked for our own internal builds).


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [EXTERNAL] Re: HPE completes Cray acquisition

2019-09-27 Thread Christopher Samuel

On 9/27/19 9:19 AM, Scott Atchley wrote:


Cray: This one goes up to 10^18


ROTFL.  Sir, you win the interwebs today. ;-)

--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] [EXTERNAL] Re: HPE completes Cray acquisition

2019-09-27 Thread Christopher Samuel

On 9/27/19 7:40 AM, Lux, Jim (US 337K) via Beowulf wrote:

“A HPE company” seems sort of bloodless and corporate.  I would kind of 
hope for  something like “CRAY – How Fast Do You Want to Go?” or 
something like that to echo back to their long history of “just make it 
fast”


"Cray: this one goes up to 11"

--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


[Beowulf] HPE completes Cray acquisition

2019-09-25 Thread Christopher Samuel

Cray joins SGI as part of the HPE stable:

https://www.hpe.com/us/en/newsroom/press-release/2019/09/hpe-completes-acquisition-of-supercomputing-leader-cray-inc.html

> As part of the acquisition, Cray president and CEO Peter Ungaro, will
> join HPE as head of the HPC and AI business unit in Hybrid IT.


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Build Recommendations - Private Cluster

2019-08-21 Thread Christopher Samuel

On 8/20/19 11:03 PM, John Hearns via Beowulf wrote:

A Transputer cluster? Squ! I know John Taylor (formerly 
Meiko/Quadrics) very well.


Hah, when I was a young sysadmin we had a heterogeneous network and one 
box was a Parsys transputer system running Idris.  The only UNIX system 
I've admined where the "shutdown" command would crash part way through 
and you'd need to type sync sync sync and then flip the big red switch.


The chips used to be made in the next city along (Newport) from the one
I grew up in (Cardiff) in South Wales. :-)


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Build Recommendations - Private Cluster

2019-08-21 Thread Christopher Samuel

On 8/21/19 3:00 PM, Richard Edwards wrote:


So I am starting to see a pattern. Some combination of CentOS + Ansible + 
OpenHPC + SLURM + Old CUDA/Nvidia Drivers;-).


My only comment there would be I do like xCAT, especially with statelite 
settings so you can PXE boot a RAM disk on the nodes but still set up 
parts of the image that are writeable over NFS from the management node 
for persistent storage.  Tends to help if you have IPMI on the nodes 
though (remote power control etc).


https://xcat.org/
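
The rough statelite flow, from memory (table and image names below are
illustrative only, so check the xCAT docs before trusting any of it):

genimage centos7-x86_64-netboot-compute     # build the diskless image
tabedit litefile                            # mark paths to be writeable/persistent
liteimg centos7-x86_64-netboot-compute      # generate the statelite links
packimage centos7-x86_64-netboot-compute    # pack the ramdisk image
nodeset compute osimage=centos7-x86_64-netboot-compute
rpower compute reset                        # hence wanting IPMI on the nodes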

Best of luck!
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] software for activating one of many programs but not the others?

2019-08-20 Thread Christopher Samuel

On 8/20/19 10:40 AM, Alex Chekholko via Beowulf wrote:

Other examples include RPM or EasyBuild+Lmod or less common tools like 
Singularity or Snap/Snappy or Flatpak.


+1 for Easybuild from me.

https://easybuilders.github.io/easybuild/

There's also Spack (I really don't like the name, it's too close to a 
really offensive term for people with disabilities from the UK when I 
was in school) here:


https://spack.io/

I've not used it but I hear it's pretty good.
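
For anyone who's not tried either, day-to-day use is pretty similar -
the package and easyconfig names below are just examples:

# EasyBuild: build from an easyconfig, resolving dependencies, then
# pick the result up as an environment module
eb HPL-2.3-foss-2019a.eb --robot
module load HPL

# Spack: install a package and bring it into your environment
spack install hdf5
spack load hdf5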

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Lustre on google cloud

2019-07-22 Thread Christopher Samuel

On 7/22/19 10:48 AM, Jonathan Aquilina wrote:

I am looking at 
https://cloud.google.com/blog/products/storage-data-transfer/introducing-lustre-file-system-cloud-deployment-manager-scripts


Amazon's done similar:

https://aws.amazon.com/blogs/storage/building-an-hpc-cluster-with-aws-parallelcluster-and-amazon-fsx-for-lustre/

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] flatpack

2019-07-22 Thread Christopher Samuel

On 7/21/19 7:30 PM, Jonathan Engwall wrote:

Some distros will be glad to know Flatpack will load your software 
center with working downloads.


Are you thinking of this as an alternative to container systems & tools 
like easybuild as a software delivery system for HPC systems?


How widely supported is it?

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Rsync - checksums

2019-06-17 Thread Christopher Samuel

On 6/17/19 6:43 AM, Bill Wichser wrote:

md5 checksums take a lot of compute time with huge files and even with 
millions of smaller ones.  The bulk of the time for running rsync is 
spent in computing the source and destination checksums and we'd like to 
alleviate that pain of a cryptographic algorithm.


First of all I would note that rsync only uses checksums if you tell it 
to, otherwise it just uses file times and sizes to determine what to 
transfer.
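
In other words:

rsync -a /src/ /dest/     # default: compare by mod time + size only (cheap)
rsync -ac /src/ /dest/    # -c/--checksum: checksum every file at both ends (expensive)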


rsync is also single-threaded, so I would take a look at what was
previously called parsync but is now parsyncfp :-)


http://moo.nac.uci.edu/~hjm/parsync/

There is the caveat there though:

# As a warning, the main use case for parsyncfp is really only
# very large data transfers thru fairly fast network connections
# (>1Gb). Below this speed, rsync itself can saturate the
# connection, so there’s little reason to use parsyncfp and in
# fact the overhead of testing the existence of and starting more
# rsyncs tends to worsen its performance on small transfers to
# slightly less than rsync alone.

Good luck!
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Containers in HPC

2019-05-24 Thread Christopher Samuel

On 5/22/19 6:10 AM, Gerald Henriksen wrote:


Paper on arXiv that may be of interest to some as it may be where HPC
is heading even for private clusters:


In case it's of interest NERSC has a page on how Shifter does containers 
and how it packs filesystems to improve performance here:


https://docs.nersc.gov/programming/shifter/overview/

That links to the Cray User Group paper and presentation from 2015, but 
the page has a more recent graph illustrating how much less time it 
takes to run the Pynamic benchmark at 4,800 ranks with Shifter versus 
normally using different filesystem options.


The reason is (as the page says):

# Shifter mounts the flattened image via a loop mount. This approach
# has the advantage of moving metadata operations (like file lookup)
# to the compute node, rather than relying on the central metadata
# servers of the parallel filesystem. Based on benchmarking using
# the pynamic benchmark, this approach greatly improves the
# performance of applications and languages like Python that rely
# heavily on loading shared libraries Fig. 2. These tests indicate
# that Shifter essentially matches the performance of a single
# docker instance running on a workstation despite the fact that
# shifter images are stored on a parallel filesystem.
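
It's the same trick you can play by hand - flatten the image once, then
every node just loop-mounts the single file (sketch only, paths made up):

# squash a container/image root into one file on the parallel
# filesystem, then loop-mount it read-only on a compute node
mksquashfs ./image-rootfs /lustre/images/myapp.squashfs
mount -t squashfs -o loop,ro /lustre/images/myapp.squashfs /mnt/myapp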

Full disclosure: I'm working at NERSC now (though this all predates me!)

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Frontier Announcement

2019-05-08 Thread Christopher Samuel

On 5/8/19 10:47 AM, Jörg Saßmannshausen wrote:


As I follow these things rather loosely, my understanding was that OpenACC
should run on both nVidia and other GPUs. So maybe that is the reason why it
is a 'pure' AMD cluster where both GPUs and CPUs are from the same supplier?
IF all of that is working out and if it is really true that you can compile
and run OpenACC code on both types of GPUs, it would be a big win for AMD.


Depends on the OpenACC compiler I guess, PGI will only support nVidia 
GPUs now (someone asked them the awkward question of whether they 
support AMD GPUs in their sponsor session this morning here at the Cray 
User Group).


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Frontier Announcement

2019-05-07 Thread Christopher Samuel

On 5/7/19 1:59 PM, Prentice Bisbal via Beowulf wrote:

I agree. That means a LOT of codes will have to be ported from CUDA to 
whatever AMD uses. I know AMD announced their HIP interface to convert 
CUDA code into something that will run on AMD processors, but I don't 
know how well that works in theory. Frankly, I haven't heard anything 
about it since it was announced at SC a few years ago.


You can always try it yourself - a fun experiment might be to use hipify
to convert a CUDA app to HIP and build it (seeing how much extra work is
needed to finish the port if it doesn't get it all), then benchmark the
original against the HIP version on an nVidia GPU to confirm it still
works correctly and see whether there's a performance penalty/gain.


https://github.com/ROCm-Developer-Tools/HIP
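
The basic round trip is pretty short - tool names as per the repo above,
but exactly how you select the nVidia backend has varied between HIP
releases, so treat this as a sketch:

hipify-perl myapp.cu > myapp.hip.cpp    # translate CUDA source to HIP
hipcc myapp.hip.cpp -o myapp_hip        # on a CUDA box HIP builds via nvcc underneath
nvcc myapp.cu -o myapp_cuda             # original CUDA build to benchmark against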

cheers!
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

2019-05-02 Thread Christopher Samuel

On 5/2/19 8:40 AM, Faraz Hussain wrote:

So should I be paying Mellanox to help? Or is it a RedHat issue? Or is 
it our hardware vendor, HP, who should be involved??


I suspect that would be set out in the contract for the HP system.

The clusters I've been involved in purchasing in the past have always 
required support requests to go via the immediate vendor and they then 
arrange to put you in contact with others where required.


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

2019-05-01 Thread Christopher Samuel

On 5/1/19 8:50 AM, Faraz Hussain wrote:


Unfortunately I get this:

root@lustwzb34:/root # systemctl status rdma
Unit rdma.service could not be found.


You're missing this RPM then, which might explain a lot:

$ rpm -qi rdma-core
Name: rdma-core
Version : 17.2
Release : 3.el7
Architecture: x86_64
Install Date: Tue 04 Dec 2018 03:58:16 PM AEDT
Group   : Unspecified
Size: 107924
License : GPLv2 or BSD
Signature   : RSA/SHA256, Tue 13 Nov 2018 01:45:22 AM AEDT, Key ID 24c6a8a7f4a80eb5
Source RPM  : rdma-core-17.2-3.el7.src.rpm
Build Date  : Wed 31 Oct 2018 07:10:24 AM AEDT
Build Host  : x86-01.bsys.centos.org
Relocations : (not relocatable)
Packager: CentOS BuildSystem 
Vendor  : CentOS
URL : https://github.com/linux-rdma/rdma-core
Summary : RDMA core userspace libraries and daemons
Description :
RDMA core userspace infrastructure and documentation, including initscripts,
kernel driver-specific modprobe override configs, IPoIB network scripts,
dracut rules, and the rdma-ndd utility.

--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

2019-05-01 Thread Christopher Samuel

On 5/1/19 7:05 AM, Faraz Hussain wrote:


[hussaif1@lustwzb34 ~]$ sminfo
ibwarn: [10407] mad_rpc_open_port: can't open UMAD port ((null):0)
sminfo: iberror: failed: Failed to open '(null)' port '0'


Sorry I'm late to this.

What does this say?

systemctl status rdma

You should see something along the lines of:

$ systemctl status rdma
● rdma.service - Initialize the iWARP/InfiniBand/RDMA stack in the kernel
   Loaded: loaded (/usr/lib/systemd/system/rdma.service; disabled; vendor preset: disabled)
   Active: active (exited) since Wed 2019-05-01 03:55:02 AEST; 21h ago
     Docs: file:/etc/rdma/rdma.conf
  Process: 10355 ExecStart=/usr/libexec/rdma-init-kernel (code=exited, status=0/SUCCESS)
 Main PID: 10355 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/rdma.service


That should take care of loading the umad and mad kernel modules (from
memory), and without that set up you'll see that sort of error.
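
A quick way to check by hand (illustrative):

lsmod | grep -E 'ib_umad|ib_mad'    # are the MAD modules loaded?
ls /dev/infiniband/                 # umad/issm devices should show up here
modprobe ib_umad                    # if not, load it and retry sminfo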


All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


Re: [Beowulf] Large amounts of data to store and process

2019-03-15 Thread Christopher Samuel

On 3/14/19 12:30 AM, Jonathan Aquilina wrote:

I will obviously keep the list updated in regards to Julia and my 
experiences with it but the little I have looked at the language it is 
easy to write code for. It's still in its infancy, as the latest version I 
believe is 1.0.1


Whilst new it has been used in anger, and at scale.  Here's a 1PF 
cosmology code written in Julia from 2017 run at $DAYJOB.


https://www.nextplatform.com/2017/11/28/julia-language-delivers-petascale-hpc-performance/

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf


[Beowulf] Application independent checkpoint/resume?

2019-03-04 Thread Christopher Samuel

Hi folks,

Just wondering if folks here have recent experiences here with 
application independent checkpoint/resume mechanisms like DMTCP or CRIU?


Especially interested for MPI uses, and extra bonus points for 
experiences on Cray. :-)


From what I can see CRIU doesn't seem to support MPI at all, and DMTCP 
only supports it over TCP/IP or (with a supplied plugin) Infiniband. Are 
those inferences true?


Any others I've missed?

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] New tools from FB and Uber

2018-10-30 Thread Christopher Samuel

On 31/10/18 5:14 am, Tim Cutts wrote:

I vaguely remember hearing about Btrfs from someone at Oracle, it 
seems the main developer has moved around a bit since!


Yeah Chris Mason (and another) left Oracle for Fusion-IO in 2012 and
then shifted from there to Facebook in late 2013.  Jens Axboe (the
kernel block layer maintainer) also moved from Fusion-IO to Facebook
shortly after (early 2014).

FB run btrfs on a bunch of their infrastructure which helped them find
problems at scale (from memory, I don't track it any more).

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Oh.. IBM eats Red Hat

2018-10-30 Thread Christopher Samuel

On 31/10/18 4:07 am, INKozin via Beowulf wrote:


Will Red Hat come out Blue Hat after IBM blue washing?


Best one I've heard was by Kenneth Hoste on the Easybuild Slack.

Deep Purple.

;-)

--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] An Epyc move for Cray

2018-10-30 Thread Christopher Samuel

On 31/10/18 9:39 am, Christopher Samuel wrote:

For those who haven't seen, Cray has announced their new Shasta 
architecture for forthcoming systems like NERSC-9 (the replacement for 
Edison).


Now I've seen the Cray PR it seems it might not be as closely coupled as 
it initially reads..


http://investors.cray.com/phoenix.zhtml?c=98390=irol-newsArticle=2374181

# With Shasta you can mix and match processor architectures (X86, Arm®,
# GPUs) in the same system as well as system interconnects from Cray
# (Slingshot™), Intel (Omni-Path) or Mellanox (InfiniBand®).

--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] An Epyc move for Cray

2018-10-30 Thread Christopher Samuel
For those who haven't seen, Cray has announced their new Shasta 
architecture for forthcoming systems like NERSC-9 (the replacement for 
Edison).


https://www.hpcwire.com/2018/10/30/cray-unveils-shasta-lands-nersc-9-contract/

It's interesting as Cray have jumped back to using AMD CPUs (Epyc) 
alongside nVidia GPUs, in conjunction with a new interconnect (Slingshot).


I've got to say that the name of the new NERSC system reminds me of 
people struggling to program in a particular scripting language.. ;-)


--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Poll - Directory implementation

2018-10-24 Thread Christopher Samuel

On 25/10/18 3:42 am, Tom Harvill wrote:


- what directory solution do you implement?
- if LDAP, which flavor?
- do you have any opinions one way or another on the topic?


At VLSCI we originally ran 389-DS multi-master with an LDAP server on 
each cluster management node plus another one on the system running the 
user management portal for the users.


Later I think we moved to a single central master with read-only 
replicas for reasons that escape me now.


Everything talked to at least 2 LDAP servers (compute nodes had their 
management node plus one other).


We went with 389-DS because that's what VPAC was using when I moved to 
VLSCI and we needed to get operational quickly.  At some point after 
that VPAC then moved to OpenLDAP for reasons unknown.


Generally if it ain't broke don't fix it. You need to have compelling 
reasons to introduce change, otherwise you end up with "Move quickly and 
break things!" rapidly followed by "Why is everything always broken and 
awful?"


All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] SIMD exception kernel panic on Skylake-EP triggered by OpenFOAM?

2018-09-25 Thread Christopher Samuel

On 10/09/18 11:16, Joe Landman wrote:


If you have dumps from the crash, you could load them up in the
debugger.  Would be the most accurate route to determine why that was
triggered.


Thanks Joe, after a bit of experimentation we've now successfully got a
crash dump. It seems to confirm what I thought was the case, in that the
process is off in kernel space dealing with an APIC interrupt (a timer
in this case) when a SIMD exception gets raised.

crash> bt
PID: 138341  TASK: 9fd7eb3c6eb0  CPU: 27  COMMAND: "shuangTwoPhaseE"
 #0 [9ff02ee6bc38] machine_kexec at 938629da
 #1 [9ff02ee6bc98] __crash_kexec at 93916692
 #2 [9ff02ee6bd68] crash_kexec at 93916780
 #3 [9ff02ee6bd80] oops_end at 93f1d738
 #4 [9ff02ee6bda8] die at 9382f96b
 #5 [9ff02ee6bdd8] math_error at 9382cca8
 #6 [9ff02ee6be98] do_simd_coprocessor_error at 9382cec8
 #7 [9ff02ee6bec0] simd_coprocessor_error at 93f28c9e
 #8 [9ff02ee6bf48] apic_timer_interrupt at 93f26791
RIP: 2b1b5d406828  RSP: 7fff1f596148  RFLAGS: 0293
RAX: 05c8  RBX: 2bce  RCX: 02c979e0
RDX: 05cb  RSI: 02dcedf0  RDI: 00b9
RBP: 7fff1f5a25d8   R8: 2d00   R9: 00b4
R10:   R11: 026bcb48  R12: 9ff05c1461e8
R13:   R14: 9ff05c146200  R15: 00010082
ORIG_RAX: ff10  CS: 0033  SS: 002b

The kernel code is pretty short for it, basically in the RHEL7 kernel
it comes down to:

Are we in user space?
No?  Oh dear.
Is there a fixup registered for this address?
No?  OK, goodbye cruel world...

I've reached out to the maintainers of the arch/x86/ part of the tree
in case they had any general ideas on whether this was all the kernel
could be expected to do.  Only feedback so far is that yes this is odd,
and a query to another developer regarding whether some additional
checks that are done for when the process is in user space might be
applicable if that process has called into the kernel at that point.

My suspicion is that the process is off doing some AVX stuff when
the timer occurs and an exception is either generated or just happens
to be delivered from the AVX unit at a bad time.

Going to see if I can persuade Easybuild to compile OpenFOAM without
AVX-512 optimisations first and try (if that doesn't fix it) turn off
different things until the problem goes away.
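
(For the record, the sort of thing I mean is either EasyBuild's optarch
override or masking AVX-512 at the compiler level - illustrative only,
and the easyconfig name below is made up:)

eb OpenFOAM-v1606+-foss-2018a.eb --robot --optarch=GENERIC
g++ -O2 -march=skylake ...    # client Skylake target: AVX2 but no AVX-512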

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] SIMD exception kernel panic on Skylake-EP triggered by OpenFOAM?

2018-09-09 Thread Christopher Samuel

On 10/09/18 11:16, Joe Landman wrote:

If you have dumps from the crash, you could load them up in the
debugger.  Would be the most accurate route to determine why that was
triggered.


Thanks Joe! Looking at our nodes I don't think we've got crash dumps
enabled, I'll see if we can get that done.

Looking at the users code there's no assembler there (all C++) so
I'm starting to think this might be the result of a compiler bug?

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] SIMD exception kernel panic on Skylake-EP triggered by OpenFOAM?

2018-09-09 Thread Christopher Samuel

Hi folks,

We've had 2 different nodes crash over the past few days with kernel
panics triggered by (what is recorded as) a "simd exception" (console
messages below). In both cases the triggering application is given as
the same binary, a user application built against OpenFOAM v16.06.

This doesn't happen every time, I can see about 28 successful runs of
the application this month (the binary was built at the end of August).

The system in question has 2 x 16C Xeon Gold 6140 Skylake-EP CPUs.

Any ideas?


--8< snip snip 8<--

2018-09-09 17:14:34 [179203.697285] simd exception:  [#1] SMP
2018-09-09 17:14:34 [179203.701527] Modules linked in: squashfs loop 
8021q garp mrp stp llc nvidia_uvm(POE) nvidia(POE) xfs skx_edac 
intel_powerclamp coretemp intel_rapl iosf_mbi irqbypass crc32_pclmul 
ghash_clmulni_intel iTCO_wdt iTCO_vendor_support rdma_ucm ib_ucm dcdbas 
aesni_intel mgag200 lrw gf128mul glue_helper ablk_helper ttm ib_uverbs 
cryptd drm_kms_helper dm_mod syscopyarea sysfillrect ib_umad sysimgblt 
fb_sys_fops drm mei_me sg ipmi_si mei lpc_ich i2c_i801 shpchp nfit 
ipmi_devintf ipmi_msghandler libnvdimm tpm_crb acpi_pad acpi_power_meter 
binfmt_misc overlay(OET) osc(OE) mgc(OE) lustre(OE) lmv(OE) fld(OE) 
mdc(OE) fid(OE) lov(OE) ko2iblnd(OE) rdma_cm iw_cm ptlrpc(OE) 
obdclass(OE) lnet(OE) libcfs(OE) ib_ipoib ib_cm sr_mod cdrom sd_mod 
crc_t10dif crct10dif_generic hfi1 rdmavt i2c_algo_bit i2c_core ahci 
crct10dif_pclmul crct10dif_common crc32c_intel libahci ib_core libata 
megaraid_sas pps_core libcrc32c [last unloaded: pcspkr]
2018-09-09 17:14:34 [179203.784359] CPU: 2 PID: 159455 Comm: 
shuangTwoPhaseE Tainted: P   OE   T 
3.10.0-862.9.1.el7.x86_64 #1
2018-09-09 17:14:34 [179203.795389] Hardware name: Dell Inc. PowerEdge 
R740/06G98X, BIOS 1.4.8 05/21/2018
2018-09-09 17:14:34 [179203.802958] task: 995c1aee8fd0 ti: 
995c1988c000 task.ti: 995c1988c000
2018-09-09 17:14:34 [179203.810539] RIP: 0010:[] 
[] apic_timer_interrupt+0x141/0x170
2018-09-09 17:14:34 [179203.819515] RSP: :995c1da46200  EFLAGS: 
00010082
2018-09-09 17:14:34 [179203.824928] RAX: 995c1988ff70 RBX: 
01a95e00 RCX: 0090
2018-09-09 17:14:34 [179203.832146] RDX:  RSI: 
995c1da46200 RDI: 995c1988ff70
2018-09-09 17:14:34 [179203.839364] RBP: 7ffd8b8ba848 R08: 
0c40 R09: 0031
2018-09-09 17:14:34 [179203.846591] R10:  R11: 
00e72148 R12: 01c4e770
2018-09-09 17:14:34 [179203.853827] R13: 0007 R14: 
011935b0 R15: 0038
2018-09-09 17:14:34 [179203.861040] FS:  2ad83f7afa00() 
GS:995c1da4() knlGS:
2018-09-09 17:14:34 [179203.869213] CS:  0010 DS:  ES:  CR0: 
80050033
2018-09-09 17:14:34 [179203.875042] CR2: 02a18000 CR3: 
0017963f8000 CR4: 007607e0
2018-09-09 17:14:34 [179203.882274] DR0:  DR1: 
 DR2: 
2018-09-09 17:14:34 [179203.889495] DR3:  DR6: 
fffe0ff0 DR7: 0400

2018-09-09 17:14:34 [179203.896714] PKRU: 5554
2018-09-09 17:14:34 [179203.899530] Call Trace:
2018-09-09 17:14:34 [179203.902065] Code: 48 39 cc 77 2f 48 8d 81 00 fe 
ff ff 48 39 e0 77 23 57 48 29 e1 65 48 8b 3c 25 78 0e 01 00 48 83 c7 28 
48 29 cf 48 89 f8 48 89 e6  a4 48 89 c4 5f 48 89 e6 65 ff 04 25 60 
0e 01 00 65 48 0f 44
2018-09-09 17:14:34 [179203.922628] RIP  [] 
apic_timer_interrupt+0x141/0x170

2018-09-09 17:14:34 [179203.929259]  RSP 
2018-09-09 17:14:34 [179203.933970] ---[ end trace 3912e5e8b3b86da4 ]---
2018-09-09 17:14:34 [179203.984039] Kernel panic - not syncing: Fatal 
exception
2018-09-09 17:14:34 [179203.989451] Kernel Offset: 0x3ca0 from 
0x8100 (relocation range: 0x8000-0xbfff)


--8< snip snip 8<--

--8< snip snip 8<--

2018-09-07 22:37:16 [201527.171417] simd exception:  [#1] SMP
2018-09-07 22:37:16 [201527.176270] Modules linked in: squashfs loop 
8021q garp mrp stp llc nvidia_uvm(POE) nvidia(POE) xfs skx_edac 
intel_powerclamp coretemp intel_rapl iosf_mbi mgag200 ttm drm_kms_helper 
irqbypass syscopyarea sysfillrect crc32_pclmul sysimgblt iTCO_wdt 
fb_sys_fops ib_ucm iTCO_vendor_support ghash_clmulni_intel rdma_ucm 
dm_mod dcdbas drm ib_uverbs aesni_intel lrw gf128mul glue_helper 
ablk_helper cryptd mei_me sg lpc_ich i2c_i801 shpchp ib_umad mei ipmi_si 
ipmi_devintf ipmi_msghandler nfit libnvdimm tpm_crb acpi_pad 
acpi_power_meter binfmt_misc overlay(OET) osc(OE) mgc(OE) lustre(OE) 
lmv(OE) fld(OE) mdc(OE) fid(OE) lov(OE) ko2iblnd(OE) rdma_cm iw_cm 
ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) ib_ipoib ib_cm sd_mod sr_mod 
cdrom crc_t10dif crct10dif_generic hfi1 rdmavt i2c_algo_bit ahci 
i2c_core crct10dif_pclmul libahci crct10dif_common crc32c_intel ib_core 
libata megaraid_sas 

Re: [Beowulf] RHEL7 kernel update for L1TF vulnerability breaks RDMA

2018-08-18 Thread Christopher Samuel

On 18/08/18 17:22, Jörg Saßmannshausen wrote:

if the problem is RDMA, how about InfiniBand? Will that be broken as 
well?


For RDMA it appears yes, though IPoIB still works for us (though ours is
OPA rather than IB; Kilian reported the same).

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] emergent behavior - correlation of job end times

2018-07-24 Thread Christopher Samuel

On 25/07/18 04:52, David Mathog wrote:


One possibility is that at the "leading" edge the first job that
reads a section of data will do so slowly, while later jobs will take
the same data out of cache.  That will lead to a "peloton" sort of
effect, where the leader is slowed and the followers accelerated.
iostat didn't show very much disk IO though.


I have to admit that was my first thought too. I also started to
speculate about power saving but I couldn't see a way there for
later jobs to catch up enough.

One fun thing would be to turn HT off and set the scheduler to
run 20 jobs at a time and see if it still happens then.

Perhaps running this step with "perf record" to try and capture
profile data and then look to see if you can spot differences
across all the runs?   Not sure if there are scripts to do that,
or how easy it would be to rig up (plus of course the extra I/O
of recording the traces will perturb the system).
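
Something as simple as this per job might be enough to compare where the
time goes (sketch only, <pid> and <jobid> are placeholders):

perf record -g -p <pid> -o perf-<jobid>.data -- sleep 60
perf report -i perf-<jobid>.data --sort=dso,symbol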

A very interesting problem!

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Fwd: Project Natick

2018-06-10 Thread Christopher Samuel

On 11/06/18 07:46, John Hearns via Beowulf wrote:

Stuart Midgley works for DUG? 

Yup, for over a decade.. :-)

--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Fault tolerance & scaling up clusters (was Re: Bright Cluster Manager)

2018-05-17 Thread Christopher Samuel

On 14/05/18 21:53, Michael Di Domenico wrote:

Can you expand on "image stored on lustre" part?  I'm pretty sure i 
understand the gist, but i'd like to know more.


I didn't set this part of the system up, but we have a local chroot
on the management nodes disk that we add/modify/remove things from
and then when we're happy we have a script that will sync that out
to the master copy on a Lustre filesystem.

The compute nodes boot a RHEL7 kernel with custom initrd, that
includes the necessary OPA and Lustre kernel modules & config
to get the networking working and access the Lustre filesystem,
the kernel then pivots its root filesystem from the initrd to
the master copy on Lustre via overlayfs2 to ensure the compute
node sees it as read/write but without the possibility of it
modifying the master (as the master is read-only in overlayfs2).

It's more complicated than that, but that's the gist..
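
(The core of the trick is just an overlay mount - heavily simplified,
paths invented:)

# read-only master image on Lustre as the lower layer, a tmpfs as the
# writeable upper layer, then switch the root filesystem onto it
mount -t tmpfs tmpfs /rw
mkdir -p /rw/upper /rw/work /newroot
mount -t overlay overlay \
      -o lowerdir=/lustre/images/compute,upperdir=/rw/upper,workdir=/rw/work \
      /newroot
exec switch_root /newroot /sbin/init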

Does that help?

All the best!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] cursed (and perhaps blessed) Intel microcode

2018-05-09 Thread Christopher Samuel

Hi Mark,

On 30/03/18 16:28, Chris Samuel wrote:


I'll try and nudge a person I know there on that...


They did some prodding, and finally new firmware emerged at the end of
last month.

/tmp/microcode-20180425$ iucode_tool -L intel-ucode-with-caveats/06-4f-01
microcode bundle 1: intel-ucode-with-caveats/06-4f-01
  01/001: sig 0x000406f1, pf mask 0xef, 2018-03-21, rev 0xb2c, size 27648


note the *with-caveats* part.

The releasenote file says:

---8< snip snip 8<---

-- intel-ucode-with-caveats/ --
This directory holds microcode that might need special handling.
BDX-ML microcode is provided in directory, because it need special 
commits in

the Linux kernel, otherwise, updating it might result in unexpected system
behavior.

OS vendors must ensure that the late loader patches (provided in
linux-kernel-patches\) are included in the distribution before packaging the
BDX-ML microcode for late-loading.

---8< snip snip 8<---

Here be dragons..

Good luck!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Bright Cluster Manager

2018-05-01 Thread Christopher Samuel

On 02/05/18 06:57, Robert Taylor wrote:


It appears to do node management, monitoring, and provisioning, so we
would still need a job scheduler like lsf, slurm,etc, as well. Is
that correct?


I've not used it, but I've heard from others that it can/does supply
schedulers like Slurm, but (at least then) out of date versions.

I've heard from people who like Bright and who don't, so YMMV. :-)

--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Large Dell, odd IO delays

2018-02-14 Thread Christopher Samuel

On 15/02/18 09:26, David Mathog wrote:

Sometimes for no reason that I can discern an IO operation on this 
machine will stall.  Things that should take seconds will run for 
minutes, or at least until I get tired of waiting and kill them.

Here is today's example:

gunzip -c largeFile.gz > largeFile


Does "perf top -p ${PID}" show anything useful about where the
processes is spending its time?

Good luck!
Chris
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Dell syscfg for KNL nodes - different to regular Dell syscfg?

2018-02-13 Thread Christopher Samuel

On 14/02/18 13:38, Christopher Samuel wrote:


Now that *might* be because I'm having to (currently) run it
on a non-KNL system for testing, and perhaps it probes the
BMC to work out what options it makes sense to show me..


So yes, that appeared to be the case.

Also if you run it without a config file it complains:

# syscfg -h
Cannot stat /etc/omreg.cfg file. Please ensure /etc/omreg.cfg file is 
present and is valid for your environment. You can copy this file from 
the DTK iso.


Copying the config file from an existing node resulted in:

# syscfg -h
The required BIOS interfaces cannot be found on this system.

But if I copy that file to another location:

# mkdir -p /opt/dell/srvadmin/etc/

# cp -pvf /etc/omreg.cfg /opt/dell/srvadmin/etc/
‘/etc/omreg.cfg’ -> ‘/opt/dell/srvadmin/etc/omreg.cfg’

# syscfg -h

syscfg Version 6.1.0 (Linux - Oct 22 2017, 09:07:10)
Copyright (c) 2002-2017 Dell Inc.

[...]

# syscfg --ProcEmbMemMode
ProcEmbMemMode=cache

So looking good!

All the best,
Chris
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Dell syscfg for KNL nodes - different to regular Dell syscfg?

2018-02-13 Thread Christopher Samuel

Hi Kilian,

On 14/02/18 12:40, Kilian Cavalotti wrote:


AFAIK, despite their unfortunate sharing of the same name, Dell's
syscfg and Intel's syscfg are completely different tools:


That I understand. :-)  The problem is that the Dell syscfg doesn't
seem to have the options that Slurm thinks it should.

Now that *might* be because I'm having to (currently) run it
on a non-KNL system for testing, and perhaps it probes the
BMC to work out what options it makes sense to show me..

I'll keep digging...

cheers!
Chris
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] Dell syscfg for KNL nodes - different to regular Dell syscfg?

2018-02-13 Thread Christopher Samuel

Hi all,

I'm helping bring up a cluster which includes a handful of Dell KNL
boxes (PowerEdge C6320p).  Now Slurm can manipulate the MCDRAM settings
on KNL nodes via syscfg, but Dell ones need to use the Dell syscfg and
not the Intel one.

The folks have a Dell syscfg (6.1.0) but that doesn't appear to have the
necessary flags which Slurm expects.  Is there a KNL specific version?

NB: I'm having to test the syscfg out on a non-KNL node as the nodes
are booted into an image that can't be modified on the fly.  Just want
to check I'm working on the right version before trying to extract
something that will run from a directory rather than having to install
a heap of RPMs to satisfy all the dependencies.

Google and the Dell website are not helping me, or else I'm too out of
practice after a month off.. :-)

All the best,
Chris
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] [upgrade strategy] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-07 Thread Christopher Samuel

On 08/01/18 09:18, Richard Walsh wrote:


Mmm ... maybe I am missing something, but for an HPC cluster-specific
solution ... how about skipping the fixes, and simply requiring all
compute node jobs to run in exclusive mode and then zero-ing out user
memory between jobs ... ??


If you are running other daemons with important content (say the munge
service that Slurm uses for authentication) then you risk the user being
able to steal the secret key from the daemon.

But it all depends on your risk analysis of course.

All the best!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] [upgrade strategy] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-07 Thread Christopher Samuel

On 07/01/18 23:22, Jörg Saßmannshausen wrote:


the first court cases against Intel have been filed:


These would have to be Meldown related then, given that Spectre is so
widely applicable.

Greg K-H has a useful post up about the state of play with the various
Linux kernel patches for mainline and stable kernels here:

http://kroah.com/log/blog/2018/01/06/meltdown-status/

He also mentioned about the Meltdown patches for ARM64:

# Right now the ARM64 set of patches for the Meltdown issue are not
# merged into Linus’s tree. They are staged and ready to be merged into
# 4.16-rc1 once 4.15 is released in a few weeks. Because these patches
# are not in a released kernel from Linus yet, I can not backport them
# into the stable kernel releases (hey, we have rules for a reason...)
#
# Due to them not being in a released kernel, if you rely on ARM64 for
# your systems (i.e. Android), I point you at the Android Common Kernel
# tree All of the ARM64 fixes have been merged into the 3.18, 4.4, and
# 4.9 branches as of this point in time.

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] [upgrade strategy] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-05 Thread Christopher Samuel

On 06/01/18 12:00, Gerald Henriksen wrote:


For anyone interested this is AMD's response:

https://www.amd.com/en/corporate/speculative-execution


Cool, so variant 1 is likely the one that SuSE has firmware for to
disable branch prediction on Epyc.

cheers,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] [upgrade strategy] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-05 Thread Christopher Samuel

On 06/01/18 03:46, Jonathan Aquilina wrote:

Chris on a number of articles I read they are saying AMD's are not 
affected by this.


That's only 1 of the 3 attacks to my understanding. The Spectre paper says:

# Hardware. We have empirically verified the vulnerability of several
# Intel processors to Spectre attacks,  including Ivy Bridge, Haswell
# and Skylake based processors.We  have  also  verified  the
# attack’s  applicability to  AMD  Ryzen  CPUs.   Finally,  we  have
# also  successfully mounted Spectre attacks on several Samsung and
# Qualcomm processors (which use an ARM architecture) found in popular
# mobile phones.

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] [upgrade strategy] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-05 Thread Christopher Samuel

On 05/01/18 10:48, Jörg Saßmannshausen wrote:


What I would like to know is: how about compensation? For me that is
the same as the VW scandal last year. We, the users, have been
deceived.


I think you would be hard pressed to prove that, especially as it seems
that pretty much every mainstream CPU is affected (Intel, AMD, ARM, Power).


Specially if the 30% performance loss which have been mooted are not
special corner cases but are seen often in HPC. Some of the chemistry
code I am supporting relies on disc I/O, others on InfiniBand and
again other is running entirely in memory.


For RDMA based networks like IB I would suspect that the impact will be
far less: the system calls to set things up will be affected, but after
that it should be much less of an issue (as the whole idea of RDMA was
to get the kernel out of the way as much as possible).

But of course we need real benchmarks to gauge that impact.

Separating out the impact of various updates will also be important;
I've heard that the SLES upgrade to their microcode package includes
disabling branch prediction on AMD k17 family CPUs, for instance.

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] [upgrade strategy] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-05 Thread Christopher Samuel

On 03/01/18 23:56, Remy Dernat wrote:


So here is me question : if this is not confidential, what will you do ?


Any system where you do not have 100% trust in your users, their
passwords and the devices they use will (IMHO) need to be patched.

But as ever this will need to be a site-specific risk assessment.

For sites running Slurm with Munge they might want to consider what
the impact of a user being able to read the munge secret key out of
memory and potentially reusing it, for instance.

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-03 Thread Christopher Samuel

On 03/01/18 19:46, John Hearns via Beowulf wrote:

I guess the phrase "to some extent" is the vital one here. Are there 
any security exploits which use this information?


It's more the fact that it reduces/negates the protection that existing
kernel address space randomisation gives you, the idea of that being to
make it harder for a wide range of exploits, known and unknown.  More
info here:

https://lwn.net/Articles/738975/

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-02 Thread Christopher Samuel

On 03/01/18 14:46, Christopher Samuel wrote:


This is going to be interesting I think...


Also looks like ARM64 may have a similar issue, a subscriber only
article on LWN points to this patch set being worked on to address the
problem there:

https://lwn.net/Articles/740393/

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] Intel CPU design bug & security flaw - kernel fix imposes performance penalty

2018-01-02 Thread Christopher Samuel

Hi all,

Just a quick break from my holiday in Philadelphia (swapped forecast 40C
on Saturday in Melbourne for -10C forecast here) to let folks know about
what looks like a longstanding Intel CPU design flaw that has security
implications.

There appears to be no microcode fix possible and the kernel fix will
incur a significant performance penalty, people are talking about in the
range of 5%-30% depending on the generation of the CPU. :-(

https://www.theregister.co.uk/2018/01/02/intel_cpu_design_flaw/

There's a post on the PostgreSQL site that measures the impact; El Reg
summarises it as:

https://twitter.com/TheRegister/status/948342806367518720?ref_src=twsrc%5Etfw

Best case: 17% slowdown
Worst case: 23%

Here's the post about the measured impact:

https://www.postgresql.org/message-id/2018010354.qikjmf7dvnjgb...@alap3.anarazel.de

This is going to be interesting I think...

All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Openlava down?

2017-12-24 Thread Christopher Samuel

On 24/12/17 10:52, Jeffrey Layton wrote:

I remember that email. Just curious if things have progressed so 
Openlava is no longer available.


That would appear to be the case. Here is the DMCA notice from IBM
to Github:

https://github.com/github/dmca/blob/master/2016/2016-10-17-IBM.md

FWIW the Teraproc download link (before it went 404) was just a
request form for people to register to get the "enterprise edition"
version according to archive.org.

cheers!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] OpenMPI & Slurm: mpiexec/mpirun vs. srun

2017-12-18 Thread Christopher Samuel

On 19/12/17 09:20, Prentice Bisbal wrote:

What are the pros/cons of using these two methods, other than the 
portability issue I already mentioned? Does srun+pmi use a different 
method to wire up the connections? Some things I read online seem to 
indicate that. If slurm was built with PMI support, and OpenMPI was 
built with Slurm support, does it really make any difference?


Benchmark.  In (much) older versions of Slurm we would find that NAMD 
built with OpenMPI would scale better with mpirun, but in more recent 
versions (from 15.x onwards I believe) we found srun scaled better instead.


Whether that's because of differences in wireup or other issues I'm not 
sure, but currently I recommend people use srun instead.  You will also 
get better accounting information.


With mpirun OMPI will start orted on each node (via srun) and then 
that will launch the MPI ranks.  With srun slurmd will launch the MPI 
ranks itself.
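
If it helps to see it concretely, the two styles look roughly like this
in a batch script (just a sketch, the binary name is a placeholder):

#!/bin/bash
#SBATCH --ntasks=64

# launched directly by slurmd (needs Slurm's PMI support):
srun ./my_mpi_app

# or the traditional way, with mpirun starting orted on each node first:
# mpirun -np $SLURM_NTASKS ./my_mpi_app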


Hope this helps!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Intel kills Knights Hill, Xeon Phi line "being revised"

2017-11-19 Thread Christopher Samuel
On 19/11/17 10:40, Jonathan Engwall wrote:

> I had no idea x86 began its life as a co-processor chip, now it is not
> even a product at all.

Ah no, this was when floating point was done via a co-processor for the
Intel x86..

-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Intel kills Knights Hill, Xeon Phi line "being revised"

2017-11-15 Thread Christopher Samuel
On 16/11/17 13:59, C Bergström wrote:

> I'm torn between

Knowing Stu and what he does I'll take the former over the latter.. :-)

-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Intel kills Knights Hill, Xeon Phi line "being revised"

2017-11-15 Thread Christopher Samuel
On 16/11/17 12:58, Gerald Henriksen wrote:

> Maybe worth pointing out that Intel has big changes in store, which
> may or may not be a factor in the Xeon Phi future:

Might be a case of history repeating itself as Xeon Phi came out of the
work Intel did on the Larrabee discrete GPU (which they killed in 2010).

-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] Intel kills Knights Hill, Xeon Phi line "being revised"

2017-11-15 Thread Christopher Samuel
Interesting times (via a colleague on the Australian HPC Slack).

https://www.top500.org/news/intel-dumps-knights-hill-future-of-xeon-phi-product-line-uncertain/

Looks like fallout from the delayed Aurora system.

Rumours flying that the Xeon Phi family is in jeopardy, but the
article has an addendum to say:

# [Update: Intel denies they are dropping the Xeon Phi line,
# saying only that it has "been revised based on recent
# customer and overall market needs."]

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Killing nodes with Open-MPI?

2017-11-05 Thread Christopher Samuel
On 26/10/17 22:42, Chris Samuel wrote:

> I'm helping another group out and we've found that running an Open-MPI
> program, even just a singleton, will kill nodes with Mellanox ConnectX 4
> and 5 cards using RoCE (the mlx5 driver). The node just locks up hard
> with no OOPS or other diagnostics and has to be power cycled.

It was indeed a driver bug, and is now fixed in Mellanox OFED 4.2 (which
came out a few days ago).

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] slow mpi init/finalize

2017-10-17 Thread Christopher Samuel
On 18/10/17 01:59, Michael Di Domenico wrote:

> i think i can safely say at this point it's probably not hardware
> related, but something went wonky with openmpi.  i downloaded the new
> version 3 that was released, i'll see if that fixes anything.

You are building Open-MPI with the config option:

--with-verbs

to get it to enable IB support?
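
i.e. a configure line along these lines (prefix and PMI path are just
placeholders, and the Slurm/PMI bits only matter if you launch via srun):

./configure --prefix=/usr/local/openmpi-3.0.0 \
    --with-verbs \
    --with-slurm --with-pmi=/usr
make -j8 && make install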

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] slow mpi init/finalize

2017-10-15 Thread Christopher Samuel
On 12/10/17 01:12, Michael Di Domenico wrote:

> i'm seeing issues on a mellanox fdr10 cluster where the mpi setup and
> teardown takes longer then i expect it should on larger rank count
> jobs.  i'm only trying to run ~1000 ranks and the startup time is over
> a minute.  i tested this with both openmpi and intel mpi, both exhibit
> close to the same behavior.

What wire-up protocol are you using for your MPI in your batch system?

With Slurm at least you should be looking at using PMIx or PMI2 (PMIx
needs Slurm to be compiled against it as an external library, PMI2 is a
contrib plugin in the source tree).
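
Picking one explicitly looks something like this (assuming a reasonably
recent Slurm, the task count is just an example):

srun --mpi=list                     # show the wire-up plugins available
srun --mpi=pmi2 -n 1024 ./my_app    # or --mpi=pmix if built against PMIx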

Hope that helps..
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] mpi alltoall help

2017-10-10 Thread Christopher Samuel
On 11/10/17 02:58, Michael Di Domenico wrote:

> i'm getting stuck trying to run some fairly large IMB-MPI alltoall
> tests under openmpi 2.0.2 on rhel 7.4

Did this work on RHEL 7.3?

I've heard rumours of issues with RHEL 7.4 and OFED.

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] Intel/Cray Aurora system pushed back a few years, expanded to 1PF

2017-10-04 Thread Christopher Samuel
Interesting news about the Intel/Cray part of the CORAL purchase, the
Aurora system destined for ANL has been delayed by 3 years to 2021 and
expanded out to an exaflop system.

https://www.top500.org/news/retooled-aurora-supercomputer-will-be-americas-first-exascale-system/

Rumours are that Knights Hill (the followup to KNL) might be the cause
of the delay...

All the best,
Chris (back in Melbourne)
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] What is rdma, ofed, verbs, psm etc?

2017-09-23 Thread Christopher Samuel
On 22/09/17 04:14, Alex Chekholko wrote:

> I don't know about RoCE but here is plain 10G ICMP ECHO round trip:

What does a 0 byte MPI ping-pong look like?
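
For anyone playing along at home, that number comes from something like
the following (hostnames are placeholders), reading off the latency
reported for the 0 byte message size:

mpirun -np 2 --host nodeA,nodeB ./IMB-MPI1 PingPong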

From memory (I'm at Berkeley at the moment) with RoCE and Mellanox
100gigE switches they provide slightly better (lower) latency than our
circa 2013 FDR14 Infiniband cluster.

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] Administrivia: List admin away at Slurm User Group next week

2017-09-20 Thread Christopher Samuel
Hi folks,

In a few hours I'm going to be travelling to the Slurm User Group at
NERSC (Berkeley, CA, USA) and then taking a little personal time
afterwards and so I won't be back in Australia until 3rd October (in
body at least, uncertain when my brain will catch up with me).

Whilst I'm at SLUG I will keep a weather eye on my email, but after that
I'm not going to be looking at emails until home again so the list will
run on autopilot at that point until I'm back.

If you're at SLUG feel free to introduce yourself, I'm really bad with
names and faces so you might need to do that a few times before it
finally sinks in. :-)

Take care all,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] What is rdma, ofed, verbs, psm etc?

2017-09-19 Thread Christopher Samuel
Great explanations Peter.

On 20/09/17 02:24, Peter Kjellström wrote:

> ofed: a software distribution of network drivers, libraries,
> utilities typically used by users/applications to run on Infiniband
> (and other networks supported by ofed).

To expand on that slightly, this also includes (to add to the acronym
soup) RoCE - RDMA over Converged Ethernet - in other words using
Ethernet networks (with appropriate switches) to do the sort of RDMA
that you can do over Infiniband.

This is important as unlike Infiniband you can't do RoCE out of the box
with something like RHEL7 (at least in the experience of the folks I'm
helping out here).

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] slurm in heterogenous cluster

2017-09-19 Thread Christopher Samuel
On 18/09/17 23:11, Mikhail Kuzminsky wrote:

> Thank you very much !
> I hope than modern major slurm versions will be succesfully translated
> and builded also w/old Linux distributions
> (for example, w/2.6 kernel).

We run Slurm 16.05.8 on RHEL6 (2.6.32 base) without issue.

-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] slurm in heterogenous cluster

2017-09-18 Thread Christopher Samuel
Hi Mikhail,

On 18/09/17 15:41, Mikhail Kuzminsky wrote:

> Is it possible to use diffenent slurm versions on different worker nodes
> of cluster (w/other slurmctld and slurmdbd versions on head node) ? If
> this is possible in principle (to use different slurmd versions on
> different worker nodes), what are the most important restrictions for this?

The best info is in the "Upgrading" section of the Slurm quickstart guide:

https://slurm.schedmd.com/quickstart_admin.html

# Slurm daemons will support RPCs and state files from the two
# previous minor releases (e.g. a version 16.05.x SlurmDBD will
# support slurmctld daemons and commands with a version of 16.05.x,
# 15.08.x or 14.11.x). [...]
#
# If the SlurmDBD daemon is used, it must be at the same or higher
# minor release number as the Slurmctld daemons. In other words,
# when changing the version to a higher release number (e.g from
# 16.05.x to 17.02.x) always upgrade the SlurmDBD daemon first.
# [...]
#
# The slurmctld daemon must also be upgraded before or at the same
# time as the slurmd daemons on the compute nodes. Generally,
# upgrading Slurm on all of the login and compute nodes is
# recommended, although rolling upgrades are also possible
# (i.e. upgrading the head node(s) first then upgrading the
# compute and login nodes later at various times). Also see
# the note above about reverse compatibility. [...]

I don't know how well you'd go with differing versions across compute
nodes; I'd suggest that if you are going to do that you have a partition
per version, as I would guess an older version will not like talking to
a newer one.

So basically you could have (please double check this!):

slurmdbd: 17.02.x
slurmctld: 17.02.x
slurmd: 17.02.x & 16.05.x & 15.08.x

or:

slurmdbd: 17.02.x
slurmctld: 16.05.x
slurmd: 16.05.x & 15.08.x

or:

slurmdbd: 16.05.x
slurmctld: 16.05.x
slurmd: 16.05.x & 15.08.x

or:

slurmdbd: 16.05.x
slurmctld: 15.08.x
slurmd: 15.08.x

or:

slurmdbd: 15.08.x
slurmctld: 15.08.x
slurmd: 15.08.x
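
A quick way to double check what each piece is actually running
(assuming the daemons are in your PATH):

slurmdbd -V      # on the database host
slurmctld -V     # on the head node
slurmd -V        # on each compute node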


Good luck!
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Varying performance across identical cluster nodes.

2017-09-17 Thread Christopher Samuel
On 15/09/17 04:45, Prentice Bisbal wrote:

> I'm happy to announce that I finally found the cause this problem: numad.

Very interesting, it sounds like it was migrating processes onto a
single core over time!  Anything diagnostic in its log?

-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Varying performance across identical cluster nodes.

2017-09-13 Thread Christopher Samuel
On 14/09/17 03:48, Prentice Bisbal wrote:

> What software configuration, either a kernel a parameter, configuration
> of numad or cpuspeed, or some other setting, could affect this?

Hmm, how about diff'ing "sysctl -a" between the systems too?

Does one load new CPU microcode in whereas another doesn't?

Still curious to know if there are any major differences in dmesg
between the boxes.
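
Something like this quick sketch would do for the sysctl comparison
(hostnames are just placeholders):

ssh goodnode 'sysctl -a | sort' > sysctl.good
ssh slownode 'sysctl -a | sort' > sysctl.slow
diff sysctl.good sysctl.slow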

For monitoring CPU settings I tend to use "cpupower monitor", here's an
example from one of our SandyBridge boxes.

# cpupower monitor
  |Nehalem|| SandyBridge|| Mperf
PKG |CORE|CPU | C3   | C6   | PC3  | PC6  || C7   | PC2  | PC7  || C0   | Cx   | Freq
   0|   0|   0|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.98|  0.02|  3100
   0|   1|   1|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.98|  0.02|  3100
   0|   2|   2|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.98|  0.02|  3099
   0|   3|   3|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.98|  0.02|  3100
   0|   4|   4|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.98|  0.02|  3100
   0|   5|   5|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3100
   0|   6|   6|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.98|  0.02|  3100
   0|   7|   7|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.98|  0.02|  3100
   1|   0|   8|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3100
   1|   1|   9|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3100
   1|   2|  10|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3100
   1|   3|  11|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3099
   1|   4|  12|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3100
   1|   5|  13|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3100
   1|   6|  14|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3099
   1|   7|  15|  0.00|  0.00|  0.00|  0.00||  0.00|  0.00|  0.00|| 99.99|  0.01|  3100

...and for a Haswell box:

[root@snowy001 ~]# cpupower monitor
  |Nehalem|| Mperf
PKG |CORE|CPU | C3   | C6   | PC3  | PC6  || C0   | Cx   | Freq
   0|   0|   0|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   1|   1|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   2|   2|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   3|   3|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   4|   4|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   5|   5|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   6|   6|  0.00|  0.00|  0.00|  0.00|| 99.95|  0.05|  2503
   0|   7|   7|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   8|   8|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|   9|   9|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|  10|  10|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|  11|  11|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|  12|  12|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|  13|  13|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|  14|  14|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   0|  15|  15|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   1|   0|  16|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   1|   1|  17|  0.00|  0.00|  0.00|  0.00|| 99.58|  0.42|  2503
   1|   2|  18|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   1|   3|  19|  0.00|  0.00|  0.00|  0.00|| 99.58|  0.42|  2503
   1|   4|  20|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   1|   5|  21|  0.00|  0.00|  0.00|  0.00|| 99.57|  0.43|  2503
   1|   6|  22|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   1|   7|  23|  0.00|  0.00|  0.00|  0.00|| 99.57|  0.43|  2503
   1|   8|  24|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   1|   9|  25|  0.00|  0.00|  0.00|  0.00|| 99.58|  0.42|  2503
   1|  10|  26|  0.00|  0.00|  0.00|  0.00|| 99.95|  0.05|  2503
   1|  11|  27|  0.00|  0.00|  0.00|  0.00|| 99.58|  0.42|  2503
   1|  12|  28|  0.00|  0.00|  0.00|  0.00|| 99.95|  0.05|  2503
   1|  13|  29|  0.00|  0.00|  0.00|  0.00|| 99.57|  0.43|  2503
   1|  14|  30|  0.00|  0.00|  0.00|  0.00|| 99.94|  0.06|  2503
   1|  15|  31|  0.00|  0.00|  0.00|  0.00|| 99.58|  0.42|  2503


cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Varying performance across identical cluster nodes.

2017-09-10 Thread Christopher Samuel
On 09/09/17 04:41, Prentice Bisbal wrote:

> Any ideas where to look or what to tweak to fix this? Any idea why this
> is only occuring with RHEL 6 w/ NFS root OS?

No ideas, but in addition to what others have suggested:

1) diff the output of dmidecode between 4 nodes, 2 OK and 2 slow to see
what differences there are in common (if any) between the OK & slow
nodes.  I would think you would only see serial number and UUID
differences (certainly that's what I see here for our gear).

2) reboot an idle OK node and a slow node and immediately capture the
output of dmesg on both and then diff that.  Hopefully that will reveal
any differences in kernel boot options, driver messages, power saving
settings, etc, that might be implicated.
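
For instance (a rough sketch, node names are placeholders):

for n in ok1 ok2 slow1 slow2; do
    ssh $n dmidecode > dmidecode.$n
    ssh $n dmesg > dmesg.$n
done
diff dmidecode.ok1 dmidecode.slow1
diff dmesg.ok1 dmesg.slow1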

Good luck!
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] RAID5 rebuild, remount with write without reboot?

2017-09-10 Thread Christopher Samuel
On 07/09/17 04:33, mathog wrote:

> Is there a difference between "mount -o remount" and "mount -a" if the
> partions/logical volumes are already mounted "ro"? 

I think mount -a will only try to mount filesystems that are not already
mounted.

 -a, --all
 Mount all filesystems (of the given types) mentioned in fstab.

[...]

  remount
 Attempt  to remount an already-mounted filesystem.  This is
 commonly used to change the mount flags for a filesystem,
 especially to make a readonly filesystem writeable.
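
So for your case something like the following is what you'd want once
the rebuild is done (the mount point is just an example):

mount -o remount,rw /export/data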

> At this point I have that just "happy to be walking away"
> feeling about the whole incident.

+1 :-)

Glad to hear you survived..

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] cluster deployment and config management

2017-09-05 Thread Christopher Samuel
On 05/09/17 15:24, Stu Midgley wrote:

> I am in the process of redeveloping our cluster deployment and config
> management environment and wondered what others are doing?

xCAT here for all HPC related infrastructure.  Stateful installs for
GPFS NSD servers and TSM servers; compute nodes are all statelite, so an
immutable RAMdisk image is built on the management node for the compute
cluster and then on boot the nodes mount various items over NFS
(including the GPFS state directory).
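
From memory the statelite workflow per image is roughly the following
(image and node group names are placeholders, the litefile/litetree
tables define what gets mounted over NFS, and do check the xCAT docs
for your version):

genimage rhels6.9-x86_64-statelite-compute
liteimg rhels6.9-x86_64-statelite-compute
nodeset compute osimage=rhels6.9-x86_64-statelite-compute
rpower compute boot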

Nothing like your scale, of course, but it works and we know if a node
has booted a particular image it will be identical to any other node
that's set to boot the same image.

Healthcheck scripts mark nodes offline if they don't have the current
production kernel and GPFS versions (and other checks too, of course),
and Slurm's "scontrol reboot" lets us do rolling reboots without needing
to spot when nodes have become idle.

I've got to say I really prefer this to systems like Puppet, Salt, etc,
where you need to go and tweak an image after installation.

For our VM infrastructure (web servers, etc) we do use Salt for that. We
used to use Puppet but we switched when the only person who understood
it left.  Don't miss it at all...

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Supercomputing comes to the Daily Mail

2017-08-14 Thread Christopher Samuel
On 15/08/17 03:12, Jeffrey Layton wrote:

> A friend of mine, Mark Fernandez, is the lead engineer on this 
> project. He works for SGI (now HPE). They are putting two servers 
> onto the ISS and are going to be running tests for a while. I don't 
> know too many details except this.

Ars Technica had more on this last weekend, which I tweeted.

https://arstechnica.com/science/2017/08/spacex-is-launching-a-supercomputer-to-the-international-space-station/

Two 1TF systems, one to go to the ISS and one to remain on
the ground as a control system, both running the same code.

# For the year-long experiment, astronauts will install the computer 
# inside a rack in the Destiny module of the space station. It is
# about the size of two pizza boxes stuck together. And while the
# device is not exactly a state-of-the-art supercomputer—it has a
# computing speed of about 1 teraflop—it is the most powerful computer
# sent into space. Unlike most computers, it has not been hardened for
# the radiation environment aboard the space station. The goal is to
# better understand how the space environment will degrade the
# performance of an off-the-shelf computer.
# 
# During the next year, the spaceborne computer will continuously run
# through a set of computing benchmarks to determine its performance
# over time. Meanwhile, on the ground, an identical copy of the
# computer will run in a lab as a control.

No details on the actual systems there though.

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] How to debug slow compute node?

2017-08-13 Thread Christopher Samuel
On 12/08/17 17:35, William Johnson wrote:

> This may be a long shot, especially in a server room where everything
> else is working as expected.

Oh agreed! But given people have covered a lot of other bases I thought
I'd throw something in from my own experience.  If all nodes boot the
same OS image then you'd not expect the kernel command lines etc to
differ, but the UEFI settings might (depending on how they are
configured usually).

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] How to debug slow compute node?

2017-08-13 Thread Christopher Samuel
On 14/08/17 08:17, Lachlan Musicman wrote:

> Can you point to some good documentation on this?

There is some on Mellanox's website:

http://www.mellanox.com/related-docs/prod_software/Mellanox_EN_for_Linux_User_Manual_v2_0-3_0_0.pdf

But it took weeks for $VENDOR to figure out what was
going on and why performance was so bad. It wasn't until
they got Mellanox into the calls that Mellanox pointed
this out to them.

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] mlx 10g ethernet

2017-08-06 Thread Christopher Samuel
On 05/08/17 03:49, Michael Di Domenico wrote:

> so given that qperf seems to agree with iperf, i guess it's an
> interesting question now why, lustre lnet_selftest and IMB sendrecv
> seem throttled at 500MB/sec

Is this over TCP/IP or using RoCE (RDMA over Converged Ethernet) ?

-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Cluster Hat

2017-08-06 Thread Christopher Samuel
On 05/08/17 06:03, Gus Correa wrote:

> There is a long 2015 thread on building Open MPI on Raspberry Pi2
> in the Open MPI mailing list.
> Not conclusive, apparently not successful,

I contacted Paul Hargrove from the OMPI devel list who does a lot of
testing on many architectures (including RPi) about his recent
experiences with RPi testing there and he wrote back saying:

# I test Open MPI release candidates on my Raspberry Pi.
# To the best of my knowledge the 1.10, 2.0, 2.1 and
# (pending) 3.0 branches all work.
# I am not using any special configure arguments.
#
# This is with Raspbian (Debian Jessie), in case that
# makes a difference.

Hope that helps..

Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] How to know if infiniband network works?

2017-08-02 Thread Christopher Samuel
On 03/08/17 02:44, Faraz Hussain wrote:

> I have inherited a 20-node cluster that supposedly has an infiniband
> network. I am testing some mpi applications and am seeing no performance
> improvement with multiple nodes.

As you are using Open-MPI you should be able to tell it to only use IB
(and fail if it cannot) by doing this before running the application:

export OMPI_MCA_btl=openib,self,sm
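
For example (the binary name is just a placeholder):

export OMPI_MCA_btl=openib,self,sm
mpirun -np 16 ./my_mpi_app

It's also worth confirming the HCA ports are actually up beforehand with
ibstat or ibv_devinfo (the port state should show as active).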

Out of interest, are you running it via a batch system of some sort?

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Hyperthreading and 'OS jitter'

2017-08-01 Thread Christopher Samuel
On 02/08/17 13:37, Evan Burness wrote:

> Thanks for the history lessons, Chris! Very interesting indeed.

My pleasure, to add to the history here's a paper from the APAC'05
conference 12 years ago that details how the then APAC (now NCI) set up
their SGI Altix cluster, including a discussion on cpusets.

http://www.kev.pulo.com.au/publications/apac05/apac05-apacnf-altix.pdf

Also includes an interesting section on dealing with SGI's proprietary
MPI stack and the problems it caused them.

> Would be interesting to take it a step further and measure what the
> impacts (good, bad, or otherwise) of picking a specific core on a given
> CPU uArch layout for the OS.

Sadly I was hoping that document would give some indication of the
benefits of reducing jitter via cpusets, but it does not.

I'd be very interested to hear what people have found there - I do know
that Slurm allows you to reserve cores to generic resources like GPUs so
that an administrator can enforce that only certain cores can access
that resource (say the cores closest to a GPU).

https://slurm.schedmd.com/gres.html

It also supports "core specialisation" which is nebulously explained as:

https://slurm.schedmd.com/core_spec.html

# Core specialization is a feature designed to isolate system overhead
# (system interrupts, etc.) to designated cores on a compute node. This
# can reduce applications interrupts ranks to improve completion time.
# The job will be charged for all allocated cores, but will not be able
# to directly use the specialized cores.

Usefully there is a PDF from the 2014 Slurm User Group which goes into
more details about it, and includes references to work done by Cray and
others into the issues about jitter and benefits from reducing it.

https://slurm.schedmd.com/SUG14/process_isolation.pdf

From that description it appears to only put the Slurm daemons for jobs
into the group, but of course there would be nothing to stop you having
a start up script that moved any other existing processes onto that core
first via their own cgroup.
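
For the record, using core specialisation looks something like this (a
sketch only, the counts and node names are placeholders):

# slurm.conf - reserve 2 cores per node for system use:
NodeName=node[001-100] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 CoreSpecCount=2

# or per job at submission time:
sbatch --core-spec=2 job.sh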

Shame that Bull's test was too small to show any benefit!

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Hyperthreading and 'OS jitter'

2017-08-01 Thread Christopher Samuel
On 26/07/17 00:31, Evan Burness wrote:

> If I recall correctly, IBM did just what you're describing with the
> BlueGene CPUs. I believe those were 18-core parts, with 2 of the cores
> being reserved to run the OS and as a buffer against jitter. That left a
> nice, neat power-of-2 amount of cores for compute tasks.

Close, but the 18 cores were for yield, with 1 core running the
Compute Node Kernel (CNK) and 16 cores for the task that the CNK would
launch. The 18th was inaccessible.

But yes, I think SGI (RIP) pioneered this on Intel with their Altix
systems; it was the reason they wrote the original cpuset code in the
Linux kernel, so they could constrain the boot services to a set of
cores and leave the rest to run jobs on.

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Administrivia: emergency Mailman work for the Beowulf list

2017-07-04 Thread Christopher Samuel
Hi all,

The emergency work detailed below is now completed and 493 subscribers
have had their email re-enabled after it was disabled by bounces.

Users sending from sites that use DMARC will now have their From: header
rewritten to look like it comes to the list to stop this happening in
future, sorry about that.

All the best,
Chris

On 05/07/17 10:07, Christopher Samuel wrote:

> Hi all,
> 
> Someone sent a message from a site that uses DMARC last night and
> consequently almost 500 subscribers had their subscriptions suspended as
> DMARC requires sites to break the well understood semantics of the From:
> header by rewriting it.
> 
> I'm about to install the latest backport of Mailman on beowulf.org which
> will then allow me to automatically reject any DMARC'd emails to the
> list as a quick fix.
> 
> Then I'll need to work out how to get the folks who got dropped back
> onto the list. :-/
> 
> Sorry about this,
> Chris
> 


-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Administrivia: emergency Mailman work for the Beowulf list

2017-07-04 Thread Christopher Samuel
On 05/07/17 10:07, Christopher Samuel wrote:

> I'm about to install the latest backport of Mailman on beowulf.org which
> will then allow me to automatically reject any DMARC'd emails to the
> list as a quick fix.

I've now completed this work, though with a slight modification.

Instead of blocking such emails I've instead told Mailman to rewrite the
From: address in the interests of keeping communications open.

Mailman *should* put their original email address into the Reply-To:
header instead so you should still be able to get back to the original
poster.

Thank you for your patience..

All the best,
Chris (now with 493 subscribers to fix up)
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] Administrivia: emergency Mailman work for the Beowulf list

2017-07-04 Thread Christopher Samuel
Hi all,

Someone sent a message from a site that uses DMARC last night and
consequently almost 500 subscribers had their subscriptions suspended as
DMARC requires sites to break the well understood semantics of the From:
header by rewriting it.

I'm about to install the latest backport of Mailman on beowulf.org which
will then allow me to automatically reject any DMARC'd emails to the
list as a quick fix.

Then I'll need to work out how to get the folks who got dropped back
onto the list. :-/

Sorry about this,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] BeeGFS usage question

2017-06-28 Thread Christopher Samuel
Hi all,

A few folks were chatting about HPC distributed filesystems over on the
Australian HPC sysadmin Slack and the question arose about whether
anyone is using BeeGFS for non-scratch (persistent) storage.

So, is anyone doing that?

Also, is anyone doing that with CephFS too?

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Register article on Epyc

2017-06-21 Thread Christopher Samuel
On 21/06/17 22:39, John Hearns wrote:

> I would speculate about single socket AMD systems, with a smaller form
> factor motherboard, maybe with onboard Infiniband.  Put a lot of these
> cards in a chassis and boot them disklessly and you get a good amount of
> compute power.

I thought it interesting that the only performance figure in that
article for Epyc was SpecINT (the only mention of SpecFP was for
Radeon).
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Heads up - Stack-Clash local root vulnerability

2017-06-21 Thread Christopher Samuel
On 22/06/17 06:54, mathog wrote:

> Most end user code would not need to be recompiled, since it does not
> run with privileges.

Ah, that's a very interesting point, the advisory doesn't explicitly
mention it but of course all the CVE's for applications (Exim, sudo, su,
at, etc) relate to setuid binaries, plus this one:

- a local-root exploit against ld.so and most SUID-root binaries
  (CVE-2017-1000366, CVE-2017-1000379) on amd64 Debian, Ubuntu, Fedora,
  CentOS;

So yes, you are quite right, this (currently) doesn't seem like
something you need to worry about with users' own codes being copied onto
the system or containers utilised through Shifter and Singularity which
exist to disarm Docker containers.

Phew, thanks so much for pointing that out! :-)

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Heads up - Stack-Clash local root vulnerability

2017-06-21 Thread Christopher Samuel
On 22/06/17 01:55, Kilian Cavalotti wrote:

> Thanks for starting the discussion here.

Pleasure!

> We're pretty much in the same boat (no changes made yet), as:
> 1. we're still running some RHEL 6.x based clusters, with x < 9,
> meaning no patches for neither the kernel nor glibc,

Ah yes, that's an interesting situation.  We're on RHEL 6.9 for our
systems currently and I plan to upgrade a test cluster and see if
anything I know how to run breaks.

> 2. those kernel+glibc patches seem to just be "mitigations" and don't
> solve the underlying problem anyway
> (cf.https://access.redhat.com/security/vulnerabilities/stackguard#magicdomid15)

Unfortunately I think you have to rely on those mitigations as an
attacker with local access could just bring on a statically linked
executable and you're hosed.

> Oh, and containers...

Yes, a double edged sword, lots more vulnerable software that will never
get an update.. :-/

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Heads up - Stack-Clash local root vulnerability

2017-06-21 Thread Christopher Samuel
On 22/06/17 02:09, Kilian Cavalotti wrote:

> And when exploits are released, which Qualys said they will do, all
> hell will break loose, because the "skills" part will go away...

Qualys' exploits are PoCs, but apparently there are active exploits of
this out in the wild already (and they possibly predate Qualys
identifying this) from what I've heard. :-(

Qualys' PoCs are due to drop next Tuesday (timezone unclear).

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] Heads up - Stack-Clash local root vulnerability

2017-06-20 Thread Christopher Samuel
On 21/06/17 10:21, Christopher Samuel wrote:

> I suspect in those cases you have to rely entirely on the kernel
> mitigation of increasing the stack guard gap size.

I'm now seeing indications this kernel change can break some
applications (we've not touched our HPC systems yet).

https://community.ubnt.com/t5/UniFi-Wireless/Unifi-Controller-and-Debian-8-kernel-upgrade/m-p/1967867#M233927

Be careful..

-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


[Beowulf] Heads up - Stack-Clash local root vulnerability

2017-06-20 Thread Christopher Samuel
Hi all,

In the interest of being a good citizen there's a new local root
vulnerability for Linux, *BSD and Solaris.

https://blog.qualys.com/securitylabs/2017/06/19/the-stack-clash

# The Stack Clash is a vulnerability in the memory management of
# several operating systems. It affects Linux, OpenBSD, NetBSD,
# FreeBSD and Solaris, on i386 and amd64.  It can be exploited
# by attackers to corrupt memory and execute arbitrary code.

They list links to various distros information on the issue.

For instance RHEL have released both kernel and glibc updates, but of
course that raises the question of statically linked binaries (yes, I
know, don't do that, but they are common) and containers such as Shifter
& Singularity with older glibc's.

I suspect in those cases you have to rely entirely on the kernel
mitigation of increasing the stack guard gap size.

cheers,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] LXD containers for cluster services and cgroups?

2017-06-20 Thread Christopher Samuel
On 17/06/17 16:59, remy.dernat wrote:

> I am curious about how do you encapsulate the job in the right cgroups
> in slurm. Could you please give us some details ?

This is natively supported in Slurm.

https://slurm.schedmd.com/cgroups.html
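
The basic setup is along these lines (a minimal sketch rather than our
exact config):

# slurm.conf
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup

# cgroup.conf
CgroupAutomount=yes
ConstrainCores=yes
ConstrainRAMSpace=yes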

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf


Re: [Beowulf] GPFS and failed metadata NSD

2017-05-21 Thread Christopher Samuel
On 01/05/17 21:40, John Hearns wrote:

> Also remember that pairs of disks probably came off the production line
> at similar times. So this is probably a twins paradox!

At my previous HPC gig we lost 2 drives in a RAID-5 array within a few
minutes of each other. They were manufactured on the same day.

Fortunately the vendor (IBM) was able to coax one of the drives back
into life so we didn't need to go to the (tested) backups of that data.

All the best,
Chris
-- 
 Christopher Samuel    Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf

