Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread John Hearns
Thanks Chris. I worked in one place which was setting up Reframe. It looked to be complicated to get running. Has this changed? On Thu, 30 Apr 2020 at 20:09, Chris Samuel wrote: > On 4/30/20 6:54 am, John Hearns wrote: > > > That is a four letter abbreviation... > > Ah you mean an ETLA

Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread Black, Brady P
You can launch clck with either slurm (which means you do not need to provide a nodefile, clck will use whatever nodes you are allocated), or with a nodefile - which is typically more of a standalone operation when the system is in maintenance mode. To help solve the mpirun issue it would be

Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread Prentice Bisbal via Beowulf
Brady, Thanks. I probably will pick your brain more, if you don't mind as I delve further into this. On 4/30/2020 11:49 AM, Black, Brady P wrote: Hi - Intel Cluster Checker person chiming in. To answer your question Prentice about runtime of Cluster Checker (CLCK), this will depend on

Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread Prentice Bisbal via Beowulf
When you launch your clck jobs, do you launch them with slurm, or do you use a nodefile? When I use a nodefile, I get an error that it can't call mpirun on one of the nodes, or something like that. I'd provide the exact error message, but I don't have access to it at the moment. Prentice On

Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread Chris Samuel
On 4/30/20 6:54 am, John Hearns wrote: That is a four letter abbreviation... Ah you mean an ETLA (Extended TLA). I've not used ICC but we do use Reframe (from CSCS) at work for testing both between maintenances on our test system for changes we're making and also after the maintenance as a

[Beowulf] ibswinfo, a tool to monitor unmanaged Infiniband switches

2020-04-30 Thread Kilian Cavalotti
Dear Beowulfers, If your clusters use Infiniband, you know there are only two types of switches: managed or unmanaged. The former come with SSH, a web interface, SNMP and everything; the latter come with LEDs. The only (and officially recommended) way to monitor unmanaged switches is to go take

[Beowulf] Reframe (was Re: [External] Re: Intel Cluster Checker)

2020-04-30 Thread Chris Samuel
On 4/30/20 12:14 pm, John Hearns wrote: Thanks Chris.  I worked in one place which was setting up Reframe. It looked to be complicated to get running. Has this changed? To be honest I am not sure, another team at NERSC set it up so I just check out our local git repo and run it with:

Re: [Beowulf] ibswinfo, a tool to monitor unmanaged Infiniband switches

2020-04-30 Thread Darren Wise
Nice one, I own an HP Voltaire 4036 which is managed but am still happy to check out the GitHub link. Thanks very much for informing us as I'm sure it will be of huge use to other users, including myself. Kind regards, Darren Wise On 30 April 2020 21:57:46 BST, Kilian Cavalotti wrote: >Dear

Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread John Hearns
That is a four letter abbreviation... Intel clearly needed to expand the namespace. On Thu, 30 Apr 2020 at 14:35, Prentice Bisbal wrote: > Intel abbreviates the cluster checker as "clck" > On 4/30/20 5:13 AM, Jim Cownie wrote: > > Beware of a TLA collision here. ICC is normally the Intel C

Re: [Beowulf] Intel Cluster Checker

2020-04-30 Thread Michael Di Domenico
i played with it about a year ago since i get it as part of the intel compiler bundle we pay for. it was overly complicated to install and run and didn't seem worthwhile. kind of like getting a piece of ikea furniture but then trying to use a phillips screwdriver to build it instead of the

Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread Darren Wise
They could go all out and Apple cool on this one, add a lowercase "i" for Intel to the beginning and have iCLCK upon the hour after though Apple might sue for IP violation issues claiming they had a new mouse coming to the market and 50 patents no-one knew about to bolster the media grabbing

Re: [Beowulf] Intel Cluster Checker

2020-04-30 Thread Black, Brady P
Hi - Intel Cluster Checker person chiming in. To answer your question Prentice about runtime of Cluster Checker (CLCK), this will depend on which set of tests or framework definition (FWD) you use and the number of servers. The default FWD is health_base, which should run in a matter of

Re: [Beowulf] [External] Re: Intel Cluster Checker

2020-04-30 Thread Prentice Bisbal via Beowulf
Intel abbreviates the cluster checker as "clck" On 4/30/20 5:13 AM, Jim Cownie wrote: Beware of a TLA collision here. ICC is normally the Intel C Compiler, or C/C++ compiler suite (since you invoke the C compiler as “icc”). :-) On 30 Apr 2020, at 08:37, John Hearns

Re: [Beowulf] Intel Cluster Checker

2020-04-30 Thread Jim Cownie
Beware of a TLA collision here. ICC is normally the Intel C Compiler, or C/C++ compiler suite (since you invoke the C compiler as “icc”). :-) > On 30 Apr 2020, at 08:37, John Hearns wrote: > > Thanks Prentice. I was discussing this only two days ago... > I used the older version of ICC when

Re: [Beowulf] Intel Cluster Checker

2020-04-30 Thread John Hearns
Thanks Prentice. I was discussing this only two days ago... I used the older version of ICC when working at XMA in the UK. When the version was changed I found it a lot more difficult to implement. I looked two days ago and the project seems to be revived, and incorporated into oneAPI. Is anyone