Re: [xcat-user] Redhat/Rocky support

2023-11-15 Thread David D Johnson
We built this script to bring up ipoib on RHELS 9.2. Your mileage may vary [root@xcat02 postscripts]# more ipoib #!/bin/bash # Define the log function function log { echo "$(date +"%Y-%m-%d %H:%M:%S") - $1" >> /root/post.log 2>&1 logger -t xcat "$1" } #
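
For anyone who wants the log helper from the excerpt above in runnable form, here is a minimal sketch with one statement per line. The POST_LOG variable is an assumption added here so the log path can be overridden; the excerpt hard-codes /root/post.log. The rest of the ipoib script is cut off in the archive and is not reconstructed.

```shell
#!/bin/bash
# Log helper modelled on the excerpt: timestamped line to a file, plus syslog.
# POST_LOG is a hypothetical override; the excerpt hard-codes /root/post.log.
log() {
    local logfile="${POST_LOG:-/root/post.log}"
    echo "$(date +"%Y-%m-%d %H:%M:%S") - $1" >> "$logfile" 2>&1
    # Also send the message to syslog with tag "xcat"; ignore failure if
    # logger is missing or syslog is unreachable (an addition in this sketch).
    logger -t xcat "$1" 2>/dev/null || true
}
```

A postscript would then call it as log "some message" at each step, so the same text lands in both the file and syslog.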

Re: [xcat-user] Dell PowerEdge C6525 -- xcat diskless install hanging

2021-05-18 Thread David D Johnson
Sorry, I think what I was reading described a hang "before" the crng init comes back ready, not after. > On May 18, 2021, at 8:30 AM, david_john...@brown.edu wrote: > > There have been reports of this in various discussion groups, one suggestion > is to plug in a mouse and move it around to help

Re: [xcat-user] [External] Can xCAT DHCP server offer addresses on IPoIB?

2021-03-16 Thread David D Johnson
Thanks, I'll give it a try tomorrow. I do use static assignment for the primary ether and generic IPoIB interfaces, but this is a dedicated VLAN, passing through an IB to ethernet gateway, used to access specialized storage. The interface will be brought up only as needed, and taken down

Re: [xcat-user] PXE-E18: Server response timeout.

2020-05-27 Thread David D Johnson
When you have two dhcp servers on the same wire/subnet, you need to make sure that the MAC address is only recognized by one of them. If the "production" server is running xcat, use makedhcp -d nodename to get rid of the association over there, and presume (or check, with makedhcp -q nodename)

Re: [xcat-user] New XCAT installtion PXE boot issue.

2018-07-03 Thread David D Johnson
The next thing that it would be doing at this point is downloading the rootimg via the http server on the management node. Maybe check the httpd logs? -- ddj > On Jul 3, 2018, at 4:28 PM, Sam Davis wrote: > > I see that this list does strip attachments. The last line on the console > when

[xcat-user] xcat node status update on compute nodes RHEL 7.3 times out

2017-06-23 Thread David D. Johnson
Something is taking 1:30 to timeout on a node shutdown / reboot. It appears to me that it’s trying to update xcat node status on the mgt node, but the network has already been shut down. It needs to run earlier if it has any chance of working at all [root@gpu001 ~]# reboot PolicyKit dae[ OK ]

[xcat-user] conserver / rcons errors after upgrade

2017-06-22 Thread David D. Johnson
Just finished updating xcat front-end node from rhels7.2 to rhels7.3, at the same time as updating xcat from 2.12.something to 2.13.4. Since then rcons gives me these errors: [root@mgt5 etc]# rcons node475 console: invalid keyword 'sslauthority' [/root/.consolerc:3] console: invalid

Re: [xcat-user] upgrading xCAT onto new servers

2017-05-25 Thread David D. Johnson
might require typing a password if there is no tty …. Hope this is useful. — ddj Dave Johnson Brown University > On Feb 8, 2017, at 5:39 PM, Christopher Samuel <sam...@unimelb.edu.au> wrote: > > On 07/02/17 22:35, David D Johnson wrote: > >> Now the proble

Re: [xcat-user] NextScale nodes not booting from the disk

2017-05-12 Thread David D. Johnson
Sorry, misspoke on one point — we did take out Legacy Mode from the BootModes. > On May 12, 2017, at 9:51 AM, David D. Johnson <david_john...@brown.edu> wrote: > > When we got our first 5465 Lenovo nodes, we changed the noderes netboot > attribute from pxe to xnba

Re: [xcat-user] NextScale nodes not booting from the disk

2017-05-12 Thread David D. Johnson
When we got our first 5465 Lenovo nodes, we changed the noderes netboot attribute from pxe to xnba and also changed BootModes.SystemBootMode to “UEFI Mode”. Even though Legacy Only is first in the boot order, they still are able to pxe boot just fine. That is the only difference I see, maybe

Re: [xcat-user] pxe booting older hardware with newer xcat

2017-05-10 Thread David D. Johnson
> > You can disable the option rom for the FC card. There are settings in asu to > disable the option rom without disabling the card. > > From: David D Johnson [mailto:david_john...@brown.edu] > Sent: Tuesday, May 09, 2017 8:56 AM > To: xCAT Users Mailing list > Subjec

Re: [xcat-user] pxe booting older hardware with newer xcat

2017-05-09 Thread David D Johnson
nd Technology Laboratory, Beijing > Tel:(86-10)82450485 > Email: erta...@cn.ibm.com > Address: 1/F, 28 Building,ZhongGuanCun Software Park, > No.8 DongBeiWang West Road, Haidian District, > Beijing, 100193, P.R.China > >

[xcat-user] pxe booting older hardware with newer xcat

2017-05-08 Thread David D. Johnson
I’m needing to reinstall / update a dozen IBM x3650-M2 servers from 2009. These Nehalem nodes get up to the point where they should be doing a PXE DHCP request, etc., but I am only seeing a blank screen. No DHCP events logged. It seems xcat would not be to blame, but I’m wondering if anyone else

Re: [xcat-user] /usr/bin/ping on diskless lost capabilities rhel7.2

2017-04-13 Thread David D. Johnson
pick > tar to start with back in the day... > > -Original Message- > From: David D. Johnson [mailto:david_john...@brown.edu] > Sent: Monday, April 10, 2017 2:50 PM > To: xCAT Users Mailing list > Subject: [xcat-user] /usr/bin/ping on diskless lost capabilities rhel7.2

[xcat-user] More questions about addkcmdline

2017-04-13 Thread David D Johnson
We need to add addkcmdline="rd.driver.blacklist=mlx4_en" and maybe more modules, basically to all diskless nodes. I vaguely remember doing this sort of thing once before. Where in tabedit would I go to put this definition in once so it would get inherited? Is there a document that shows the

[xcat-user] /usr/bin/ping on diskless lost capabilities rhel7.2

2017-04-10 Thread David D. Johnson
mgt# getcap /usr/bin/ping /usr/bin/ping = cap_net_admin,cap_net_raw+p mgt# ssh compute compute# getcap /usr/bin/ping compute# Somewhere along the line, unpacking the rootimg I would guess, the binary for ping and its friends loses the required privilege / capability to actually function for

[xcat-user] statelite vs stateless

2017-04-03 Thread David D. Johnson
We have several hundred stateless compute nodes, but we’re starting to wonder if we should be using statelite provisioning instead. The primary issue we would hope to address is having a place to save Kerberos host credentials. Of course we could do this at boot time using some kind of post

Re: [xcat-user] upgrading xCAT onto new servers

2017-02-08 Thread David D Johnson
West Road, Haidian District, > Beijing, 100193, P.R.China > > > - Original message - > From: David D Johnson <david_john...@brown.edu> > To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> > Cc: > Subject: Re: [xcat-user] upgrading xCAT ont

Re: [xcat-user] upgrading xCAT onto new servers

2017-02-07 Thread David D Johnson
at will without any manual intervention. > On Feb 7, 2017, at 2:02 PM, David D. Johnson <david_john...@brown.edu> wrote: > > Drilling down deeper, seems to be two different situations. > > On nodes without X11, remoteshell script takes less than a second, > -rw--- 1 roo

Re: [xcat-user] upgrading xCAT onto new servers

2017-02-07 Thread David D. Johnson
it was NetworkManager, but it turns out it was firewalld. (chroot . systemctl disable firewalld ) — ddj > On Feb 7, 2017, at 6:35 AM, David D Johnson <david_john...@brown.edu> wrote: > > That was already the case (IP of mgt1 and IP of mgt[2] are the forwarders). > I don't belie

Re: [xcat-user] upgrading xCAT onto new servers

2017-02-07 Thread David D Johnson
ng,ZhongGuanCun Software Park, > No.8 DongBeiWang West Road, Haidian District, > Beijing, 100193, P.R.China > > > - Original message - > From: "David D. Johnson" <david_john...@brown.edu> > To: "xcat-user@lists.sourceforge.net" <xcat-user@l

[xcat-user] upgrading xCAT onto new servers

2017-02-03 Thread David D. Johnson
We’re upgrading cluster mgt node hardware and software at the same time, going from 2.8.3 to 2.13.1, and from centos6.7 to rhels7.2. I have the new frontend installed and somewhat functional. Right now I’m needing to clone the DNS / named from “mgt1” that is still authoritative for the

[xcat-user] Customizing /etc/named.conf with xCAT

2016-10-19 Thread David D Johnson
I need to add a delegated zone to the named configuration on the xCAT management nodes. It seems to me that /etc/named.conf gets clobbered every time makedns -a or makedns -d gets run. Currently the mgt1 node is authoritative for oscar.ccv.brown.edu, and a bunch of

Re: [xcat-user] discovering interface names dynamically

2016-06-30 Thread David D Johnson
A simple place to start would be “ls /sys/class/net”… after that heuristics to eliminate “lo” and “ib*”, use ethtool to see which of the rest has link up, and choose the one with the lowest number that is lit up. — ddj > On Jun 30, 2016, at 1:17 PM, Andrew Loftus wrote:
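
The heuristic described above can be sketched as a small bash function. This is only a sketch under stated assumptions: it reads the sysfs carrier file instead of calling ethtool (a swap made here so it needs no extra tools), the function name pick_iface is made up for illustration, and the optional directory argument exists only so the logic can be tried against a fake tree.

```shell
#!/bin/bash
# pick_iface [SYSFS_NET_DIR]   (hypothetical helper name)
# Print the lowest-numbered interface that is not lo or ib* and has link up.
pick_iface() {
    local netdir="${1:-/sys/class/net}"
    local name
    # sort -V orders eth2 before eth10, so the lowest number is tried first.
    for name in $(ls "$netdir" | sort -V); do
        case "$name" in
            lo|ib*) continue ;;   # skip loopback and IPoIB interfaces
        esac
        # carrier reads 1 when the link is up; the read fails on an
        # administratively down interface, which is treated as "no link".
        if [ "$(cat "$netdir/$name/carrier" 2>/dev/null)" = "1" ]; then
            echo "$name"
            return 0
        fi
    done
    return 1   # nothing with link found
}
```

Calling pick_iface with no argument checks the live system. Note that an interface has to be brought up before its carrier state can be read, so a postscript may need to run ip link set up on the candidates first.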

Re: [xcat-user] psh *** ssh exited with error code 3

2016-03-19 Thread David D Johnson
I can’t get rid of the quote level, but here is a workaround: > psh cn1 'service snmpd status || true' > On Mar 17, 2016, at 7:56 PM, Stanislav Sergienko wrote: > > Hello, > > Every time I run the service status command and the service is not running I > get ssh exited
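
Why the workaround helps, as a small local sketch: LSB-style init scripts return exit code 3 from "status" when the service is not running, and that code is what comes back through ssh. Appending || true makes the remote command line succeed regardless. The status_stopped function below is a stand-in invented for this illustration, not an xCAT or service command.

```shell
#!/bin/bash
# Hypothetical stand-in for "service snmpd status" on a stopped service:
# LSB init scripts exit 3 when the program is not running.
status_stopped() { return 3; }

status_stopped
rc_plain=$?            # exit code as ssh/psh would see it without the workaround

status_stopped || true
rc_masked=$?           # exit code with the "|| true" workaround applied

echo "plain=$rc_plain masked=$rc_masked"   # prints: plain=3 masked=0
```

The trade-off is that || true hides every failure, so the status has to be read from the command's text output rather than its exit code.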

Re: [xcat-user] Problems with x desktop after diskless image build /boot

2016-03-02 Thread David D Johnson
because of pre-linking When I do yum update dbus on the diskless node booted with this image, the problem goes away. Why don't the groups get added to the group file before the transaction/postinstall starts? -- ddj On Mar 2, 2016, at 2:56 PM, David D Johnson <david_john...@brown.edu>

Re: [xcat-user] Problems with x desktop after diskless image build /boot

2016-03-02 Thread David D Johnson
From output of running ./genimage -i eth0 -n dca,8021q,igb,bnx2,tg3 -o centos6.5 -k 2.6.32-358.23.2.el6.x86_64 -p dev: Installing : 1:gdm-2.30.4-52.el6.x86_64 688/817 warning: group gdm does not exist - using root

Re: [xcat-user] issue programing bmc

2015-10-06 Thread David D Johnson
> Is there way to force it to reprogram again? > > Damir > > On Tue, Oct 6, 2015 at 10:59 AM David D Johnson <david_john...@brown.edu> > wrote: > My suspicion is that your IMM2 is set to use the dedicated IMM ethernet port, > but you intended to use the shared IMM/eth

Re: [xcat-user] Golden Client debugging help needed

2015-09-09 Thread David D Johnson
m > Tel: 86-10-82453253 > Address: Building 28, ZhongGuanCun Software Park, > No.8, Dong Bei Wang West Road, Haidian District Beijing 100193, PRC > > Building 28, ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District, Beijing > Postal code: 100193 > > David D Johnson ---10/09/2015 01:11:39 AM---I'm trying to follow > the instru

[xcat-user] Golden Client debugging help needed

2015-09-09 Thread David D Johnson
I'm trying to follow the instructions on http://sourceforge.net/p/xcat/wiki/Using_Clone_to_Deploy_Server/ to generate a custom diskless boot image from a more generic one. Original is barebones CentOS 6.5. Added enough rpms to install GPFS and Mellanox OFED, then deleted the extras like

Re: [xcat-user] NextScale deployment kernel crash

2015-06-25 Thread David D Johnson
]- Your hardware is unsupported. Please do not report bugs, panics, oopses, etc., on this hardware. On Jun 25, 2015, at 3:46 PM, David D Johnson david_john...@brown.edu wrote: Well, that got me a shell, and the system got the correct address on eth0. Waiting for device with address 40:f2:e9

Re: [xcat-user] NextScale deployment kernel crash

2015-06-25 Thread David D Johnson
in them regardless, complete with ssh and all. From: David D Johnson [mailto:david_john...@brown.edu] Sent: Thursday, June 25, 2015 2:00 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] NextScale deployment kernel crash I may have jumped to conclusions about the reason, but in any

Re: [xcat-user] NextScale deployment kernel crash

2015-06-25 Thread David D Johnson
root 225896 Apr 20 13:07 lib/modules/2.6.32-358.23.2.el6.x86_64/kernel/drivers/net/tg3.ko [root@mgt1 settings]# Thanks for the help. Should I talk to xcat support? (we have a contract). -- ddj From: David D Johnson [mailto:david_john...@brown.edu] Sent: Thursday, June 25, 2015 3:46 PM

Re: [xcat-user] NextScale deployment kernel crash

2015-06-25 Thread David D Johnson
I may have jumped to conclusions about the reason, but in any case the two new M5 machines don't boot. This is the line specifying drivers from our build script: ./genimage -i eth0 -n dca,8021q,igb,bnx2,tg3 -o centos6.5 -k 2.6.32-358.23.2.el6.x86_64 -p comp As to the ethernet interfaces, from

Re: [xcat-user] xCAT 2.8.4 and HP IPMI

2014-08-05 Thread David D Johnson
My recollection is foggy, but I think they stopped supporting some weak cipher(s), and our system was missing the SSL package(s) (RPM) that provided any of the remaining stronger ciphers. In our case adding the RPM fixed things. However your ipmitool output that is missing the whole section

Re: [xcat-user] xCAT 2.8.4 and HP IPMI

2014-08-01 Thread David D Johnson
Somewhere along the line, I dimly recall a change to the cipher suites that xcat supports, and which one is used by default. But since your output shows no cipher suites at all, perhaps you might want to check the firmware version for the BMC, and if it is up-to-date it's possible that

Re: [xcat-user] runimage didn't find the binary for execution

2014-03-03 Thread David D Johnson
Can you check with ldd or file on ./cli -- if it's a 32 bit binary and you only have 64 bit libraries, there might be a disconnect. -- ddj On Mar 3, 2014, at 3:55 PM, Stagneth, Andre wrote: Path isn’t the problem nodeset comp430 shell rpower comp430 reset scp cli comp430:/tmp ssh

Re: [xcat-user] Interactive install

2014-02-28 Thread David D Johnson
OK, I found the file in /tftpboot/pxelinux.cfg/cache001 (which is the hostname). I stripped out the ks= and change the ksdevice= to the mac address (I had previously had eth2, which didn't work at all). Here is the lsdef: [root@mgt1 tftpboot]# lsdef cache001 Object name: cache001

Re: [xcat-user] Name the console/hardware manager

2014-02-01 Thread David D Johnson
Conflation can often imply improper mixing, as in two stories getting mashed together. On Feb 1, 2014, at 9:24 AM, Jarrod Johnson jarrod.b.john...@gmail.com wrote: So I wanted to nail down a name before getting ready to push the tree public. I was considering the following names:

[xcat-user] gettyset seems to be broken with RHEL 6 xcat 2.8.3 ttyS1

2014-01-28 Thread David D Johnson
I'm baffled where gettyset actually comes from... in any case, there are error messages coming from gettyset since we moved from RHEL 5 to RHEL 6. Machines with console redirection on ttyS0 seem to end up working OK, but I have new machines where the console is on ttyS1. No booting info

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread David D Johnson
I've been lurking on this discussion, and just checked to see what we've got -- nbroot or genesis -- and we have both of them. I had given up on node discovery years ago, we originally used the switch port numbers and forwarding tables to assign node names. Now I use ASU to collect the macs,

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread David D Johnson
On Jan 22, 2014, at 8:30 AM, Jonathan Mills jonmi...@renci.org wrote: Comments inline... On 1/22/14, 8:08 AM, David D Johnson wrote: I've been lurking on this discussion, and just checked to see what we've got -- nbroot or genesis -- and we have both of them. I had given up on node

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread David D Johnson
, but scaling it up causes a lot of ambiguity to trudge through. David D Johnson ---01/22/2014 08:53:19 AM---On Jan 22, 2014, at 8:30 AM, Jonathan Mills jonmi...@renci.org

[xcat-user] Embedding GPFS 3.5 into xcat diskless boot images

2014-01-22 Thread David D Johnson
We had a method that worked for the last 4.5 years up to this point, but today it stopped working. The old way had a copy of mgt node's /var/mmfs/gen embedded in the image. Adding new nodes was tedious, but doable: ssh to each node, remove the previous contents of /var/mmfs/gen, mmaddnode

Re: [xcat-user] Embedding GPFS 3.5 into xcat diskless boot images

2014-01-22 Thread David D Johnson
during image build and set rc.local to run mmsdrrestore on the stateless node to sync it with the cluster. Regards, Christian Caruthers Senior Consultant System x Linux HPC Mobile: 757-289-9872 Sent from Lotus Traveler David D Johnson --- [xcat-user] Embedding GPFS 3.5 into xcat diskless

[xcat-user] genimage doesn't produce initrd files for centos 6.4

2014-01-21 Thread David D Johnson
Trying to rebuild netboot images for centos 6.4, using script that has worked for ages with centos 6.2 and 6.3. After running genimage, there are no initrd-state{less,lite}.gz files: [root@mgt1 netboot]# ls */x86_64/comp centos6.2/x86_64/comp: initrd-stateless.gz initrd-statelite.gz kernel

Re: [xcat-user] dhcp timeout in xnba

2013-10-28 Thread David D Johnson
We had a problem where the switch ports had been properly configured with spanning-tree edge mode for the first 32 ports, but when we added new nodes, we didn't notice that ports 33-48 on the switch had been left to the default. Edge mode comes up quickly, but the default took too long to