[xcat-user] Morgan Stanley stor204 Install Status 3/13
Action Items:
* The 1Gb provisioning connection for stor204lgc1n1 is down.
* The 100Gb ports for DC1 appear not to support jumbo frames.

This evening, I was able to:
* Install & configure SN1
* Discover & install DSS nodes
* Verify SAS & drive topology
* Verify drives w/ read/write I/O test
* Verify drive & enclosure firmware levels

Tomorrow, I will be working on stor0010 at DC1. I'll be at DC2 Friday evening to continue work on stor204.

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: +1 757-289-9872

___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] Ipmitool support for old BMC cipher suite 3
Don,

Confluent was originally designed to run alongside xCAT, and that process is pretty easy: https://hpc.lenovo.com/users/documentation/configconfluent_xcat.html

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

From: Don Avart
Sent: Wednesday, January 10, 2024 11:24 AM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] [External] Ipmitool support for old BMC cipher suite 3

Jarrod,

Would/could goconserver from Confluent be brought into xCAT relatively easily?

Don Avart
CTO
RedLine Performance Solutions, LLC
(703) 634-5686
dav...@redlineperf.com

On Jan 10, 2024, at 11:09 AM, Jarrod Johnson <jjohns...@lenovo.com> wrote:

gocons is 'goconserver'. confluent has a baked in console handler for ipmi that is written in python. One could imagine a modification to the ipmitool invocation to try default and add -C 3 if it fails (exits within a second or so)

From: David Johnson <david_john...@brown.edu>
Sent: Wednesday, January 10, 2024 11:02 AM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] [External] Ipmitool support for old BMC cipher suite 3

For console I’m still broken with both goconserver and ipmitool (w/o -C 3). I thought gocons came from confluent; is there a better way to do console now from confluent?

-- ddj
Dave Johnson

On Jan 10, 2024, at 10:44 AM, Jarrod Johnson <jjohns...@lenovo.com> wrote:

Well, I suspect it works when the amended result was posted that the xCAT fallback did function fine. So it's a matter of ipmitool's fallback being perhaps too picky or outright broken. In xCAT/confluent we try 17 and, if that fails, just start over at 3. ipmitool tries to more carefully decide what its initial attempt will be based on advertised support (I think, from a cursory glance). So I could imagine how a strange response to supported ciphers could steer ipmitool wrong when xcat/confluent can fare better.
Unfortunately on our side we deprecated use of ipmitool for console, so I'm a bit rusty in evaluation. From: Ryan Novosielski mailto:novos...@rutgers.edu>> Sent: Tuesday, January 9, 2024 10:23 PM To: Jarrod Johnson mailto:jjohns...@lenovo.com>> Cc: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: Re: [xcat-user] [External] Ipmitool support for old BMC cipher suite 3 That’s a good question! We don’t currently have a Confluent system running anything newer than RHEL7 managing anything other than DSS-G equipment, but we’re planning to upgrade our management system to RHEL9 soon, or alternatively could add an additional machine to one of the DSS-G clusters to see. -- #BlackLivesMatter || \\UTGERS, |---*O*--- ||_// the State | Ryan Novosielski - novos...@rutgers.edu<mailto:novos...@rutgers.edu> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\of NJ | Office of Advanced Research Computing - MSB A555B, Newark `' On Jan 9, 2024, at 18:16, Jarrod Johnson mailto:jjohns...@lenovo.com>> wrote: Curious, how does confluent ipmi interaction work against those systems? does it manage to successfully downgrade transparently? From: Ryan Novosielski via xCAT-user mailto:xcat-user@lists.sourceforge.net>> Sent: Tuesday, January 9, 2024 5:37 PM To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Cc: Ryan Novosielski mailto:novos...@rutgers.edu>> Subject: Re: [xcat-user] [External] Ipmitool support for old BMC cipher suite 3 I can confirm that that last part is not true: root@fw01-hpc-hill:/home/novosirj 11:11 PM# ipmitool -U USERID -I lanplus -H master-imm chassis status Password: Error in open session response message : no matching cipher suite Error: Unable to establish IPMI v2 / RMCP+ session …and suspected as much since I had to learn anything about the cipher suites and -C. :-D Maybe the version provided by RHEL derivatives has defaults or something? We’re on RHEL8/9 where we’re seeing it. 
— #BlackLivesMatter || \\UTGERS, |---*O*--- ||_// the State | Ryan Novosielski - novos...@rutgers.edu<mailto:novos...@rutgers.edu> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\of NJ | Office of Advanced Research Computing - MSB A555B, Newark `' On Jan 9, 2024, at 16:24, Jarrod Johnson mailto:jjohns...@lenovo.com>> wrote: In what context do you find use of ipmitool with '-C'? I was checking the ipmi console backend and it doesn't seem to have that. rpower and such should try SHA256, fallback to SHA1 (equivalent to -C 3) The ipmi backend for conserver, if used, doesn't currently attempt
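Jarrod's suggestion earlier in this thread (try the default cipher suite, then add -C 3 if that fails) could be sketched as a small wrapper like the one below. This is only an illustration, not how xCAT or confluent implement their fallback. The function name and the IPMITOOL override variable are made up for the example; -I lanplus, -C and -E (read the password from the IPMI_PASSWORD environment variable) are standard ipmitool options.

```shell
#!/bin/sh
# Sketch: run an ipmitool command with the default cipher suite first and,
# if that fails, retry once with cipher suite 3 (-C 3).
# IPMITOOL is a hypothetical override, e.g. to point at a stub for testing.
IPMITOOL="${IPMITOOL:-ipmitool}"

ipmi_with_fallback() {
    # $1 = BMC host, $2 = user, remaining arguments = ipmitool subcommand
    bmc="$1"; user="$2"; shift 2
    # First attempt: let ipmitool pick its default cipher suite.
    if "$IPMITOOL" -I lanplus -H "$bmc" -U "$user" -E "$@"; then
        return 0
    fi
    echo "default cipher suite failed for $bmc, retrying with -C 3" >&2
    # Second attempt: force cipher suite 3.
    "$IPMITOOL" -I lanplus -C 3 -H "$bmc" -U "$user" -E "$@"
}

# Example, using the host and user from the session quoted above:
#   IPMI_PASSWORD=... ipmi_with_fallback master-imm USERID chassis status
```

Note that this retries on any failure, not only on "no matching cipher suite", so a wrong password costs two attempts.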
Re: [xcat-user] [External] Re: Announcement: xCAT Project End-Of-Life planned for December 1, 2023
Same for me. Would like to attend any virtual discussion that is had.

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

From: Nathan A Besaw via xCAT-user
Sent: Monday, October 2, 2023 1:59 PM
To: xCAT Users Mailing list
Cc: Nathan A Besaw
Subject: [External] Re: [xcat-user] Announcement: xCAT Project End-Of-Life planned for December 1, 2023

I'm not attending SC23 in person this year, but I am interested in attending any virtual discussions about the future of xCAT.

From: Don Avart <dav...@redlineperf.com>
Sent: Thursday, September 28, 2023 5:56 PM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] [External] Re: Announcement: xCAT Project End-Of-Life planned for December 1, 2023

All,

Would there be interest in an unofficial “birds of a feather” type meeting for xCAT at SC23 to discuss the future of xCAT? I may be able to line up a conference room for folks attending to get together. If there’s interest I assume we can also include a Zoom or Teams conference for those unable to attend.
Don Avart CTO RedLine Performance Solutions, LLC (703) 634-5686 dav...@redlineperf.com<mailto:dav...@redlineperf.com> On Sep 21, 2023, at 5:13 PM, Jarrod Johnson mailto:jjohns...@lenovo.com>> wrote: Yes, we are committed to it being open source ongoing. I won't rule out proprietary things built on top of it, but at least in all the ways that exist today and the CLI I don't imagine any changes. Currently, the GUI is not technically open sourced (though everyone gets the source code, but no redistribution). I do hope to at least open source our upcoming browser library that makes writing a webui with all the async behaviors a bit more trivial (which is what the next WebUI will be written with). Yes, non-Lenovo functions are welcome. As of 3.8.1, after testing the redfish update on a 'generic' openbmc system and Bluefield 3 BMC, the push firmware updates have been placed in generic. Currently the only non-Lenovo vendor specific behavior is a bit to deal with a peculiar choice with Dell virtual media, other things are plain redfish. A target rich area would be the out-of-band discovery code, since I wager all the major implementations are capable. I did see one OpenBMC solution that wouldn't really work, but I've seen other OpenBMC implementations that were workable. Most other stuff is at least described by standards (uefi settings, firmware updates for example) and vendors that bother to implement it tend to stick to the standard, so far. The trickiest thing is 'nodehealth', where redfish provides a HealthRollup, but I feel like systems rarely use it well enough. Maybe that's just a motivation to push vendors to use that more consistently if it's a problem... In any event, configbmc (bmcsetup replacement) uses standard IPMI, should be easier to extend than the bmcsetup script has been, and PXE discovery works like it can in xCAT (though without the requirement to actually boot a Linux anymore, discovery happens on the DHCPDISCOVER packet in the PXE attempt). 
So the 'worst case' is as far as we ever bothered to push xCAT, discovery wise. Governance is a matter that can be discussed. Currently I am the arbiter of pulls, but we can discuss other options. With xCAT when we were still tied to xCAT, we maintained a Lenovo branch so that I was no longer arbiter of master, but we still had freedom to release changes without getting them in master (e.g. SHA256 IPMI support was one that we could drive into Lenovo, but didn't work hard enough to get into master). The 'lenovobuild' branch in xcat-core is what I reference, all still open source, just the ability to snag pull requests even if they don't make it through to main branch on time for one of our requirements. On the last, much depends on what is seen as missing that we want to continue. Off the top of my head, confluent doesn't push ISC dhcp configuration (because it has a built in PXE server that can work with either an uncontrolled DHCP server or in a static only fashion) or ISC BIND (it still, upon request, helps make /etc/hosts files, but s
Re: [xcat-user] [External] question on management node migration
You're correct that copycds is missing from the write-up, and I don't see it mentioned in the "Install xCAT" link. It could be done at any time after xCAT is installed on the new MN.

As for "static groups," are you referring to xCAT groups? If so, those should be restored from the xCAT DB dump. OS groups would come from the /etc/group file (though I would NOT just overwrite the new one: system GIDs aren't guaranteed to be the same from install to install, and overwriting them can cause big headaches).

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

-----Original Message-----
From: Imam Toufique
Sent: Thursday, June 29, 2023 4:06 PM
To: xCAT Users Mailing list
Subject: [External] [xcat-user] question on management node migration

Hello,

I am working to do a management node migration, and I have a question. I am following this link, https://xcat-docs.readthedocs.io/en/stable/advanced/migration/migration.html, and I see everything that needs to be done here, but I don't see any mention of when / how to restore the OS images and static groups that I have. Should I run copycds to install all the OS images and create the static groups manually after installing the xcat rpms on the new node, and then restore all the files (as mentioned in steps 1.1 through 1.11)? Or should I do something else? Any suggestions?

Thanks in advance.
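The caution about not overwriting /etc/group can be turned into a quick check before anything is merged. The following is a minimal sketch, assuming a copy of the old management node's group file is available on the new one; the function name and the path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: list groups that exist on both management nodes but with a
# different GID, so they can be reconciled by hand instead of overwriting
# the new /etc/group.
# Usage: gid_conflicts OLD_GROUP_FILE NEW_GROUP_FILE
gid_conflicts() {
    awk -F: '
        NR == FNR { old_gid[$1] = $3; next }    # first file: old MN groups
        ($1 in old_gid) && old_gid[$1] != $3 {  # same name, different GID
            printf "%s old=%s new=%s\n", $1, old_gid[$1], $3
        }
    ' "$1" "$2"
}

# Example (the backup path is a placeholder):
#   gid_conflicts /root/migration/group.old /etc/group
```

Groups that only exist in the old file are not reported here; those can usually be appended once their GIDs are confirmed to be free on the new node.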
Re: [xcat-user] [External] Re: BitTorrent distribution of stateless images with xCAT interesting to anyone?
Those are set by xCAT. $MASTER comes from the master field in the site table (site.master), and $NODE comes from the nodelist.node setting (what you used when you first defined the node in xCAT).

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

From: Tomer Shachaf
Sent: Thursday, April 13, 2023 6:31 AM
To: xCAT Users Mailing list
Subject: [External] Re: [xcat-user] BitTorrent distribution of stateless images with xCAT interesting to anyone?

A small question: where does the node get the variables $MASTER and $NODE from?

From: Mark Gurevich via xCAT-user
Sent: Wednesday, 12 April 2023 15:52
To: xCAT Users Mailing list
Cc: Mark Gurevich
Subject: Re: [xcat-user] BitTorrent distribution of stateless images with xCAT interesting to anyone?

Tomer, have you tried following these instructions?
https://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/manage_clusters/ppc64le/discovery/mtms/discovery_using_defined.html

From: Tomer Shachaf
Sent: Wednesday, April 12, 2023 8:24 AM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] [External] BitTorrent distribution of stateless images with xCAT interesting to anyone?

Can anybody help me with a guide on how to configure MTMS discovery or sequential discovery for nodes?
Thanks a lot

Regards,
Tomer Shachaf | Integration and Infrastructure Engineer | Integration and Infrastructure Division | Matrix | Mobile 054-2686841 | tomers...@matrix.co.il | www.matrix.co.il

On 5 Apr 2023, at 16:37, Jarrod Johnson <jjohns...@lenovo.com> wrote:

On the "not even root is allowed to make changes" point, there are mechanisms to at least partially get there. However, the nature of things is that many PCIe devices do not consider such a boundary. Disabling KCS in a BMC is an option provided now. Without that access, at least our platforms don't provide a path to UEFI or BMC firmware updates or configuration changes. However, most devices you may add would likely still be open to manipulation. That manipulation is frequently bounded nowadays, with many components having a first stage firmware loader that refuses to continue if the firmware payload does not pass a signature check.

As for SecureBoot, in practice it does protect the kernel space, but user space is left uncovered (e.g. something like a malicious /etc/shadow is impossible to cover in the SecureBoot scheme). The approach there would be trusted boot, with encrypted boot and sealing the encryption key to PCRs according to your desire for tamper detection and lockout. In this case, you could for example have booting from a rescue disk result in the system being unable to decrypt the boot volume (you may optionally have a 'recovery' password in another slot to allow password based recovery). Or if secureboot is disabled, it can't decrypt the boot volume. In the confluent diskless boot, it extends one of the PCRs so that not even root on the system can retrieve the API key from TPM. The PCRs cover things like firmware loads on boot components, so while you may not prevent a firmware change, you may be able to render an attacked device unable to read the boot volume.
This one is tricky ground, as you are balancing protecting the data against attacks versus accidentally locking yourself out through an intended, innocuous change. From: Dr. Thomas Orgis mailto:thomas.or...@uni-hamburg.de>> Sent: Wednesday, April 5, 2023 9:10 AM To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: Re: [xcat-user] [External] BitTorrent distribution of stateless images with xCAT interesting to anyone? Some details on our setup in light of the approach Jarrod outlined … Am Wed, 29 Mar 2023 17:37:30 + schrieb Jarrod Johnson mailto:jjohns...@lenovo.com>>: > On confluent diskless, there is an interesting benefit that becomes a > challenge for bittorrent: a typical diskless node never downloads the > whole diskless image. This means less ram sucked up by the diskless > image, and also that the diskless image can be large without pruning. I guess this is mitigated by our OS image being rather minimal to begin with. It only has the basic system software and drivers, up to a working C/C++ compiler setup that is able to bootstrap further software. Such further software is provided in a versioned tree via NFS and managed via environment modules. So such an approach to optimize the usage of a large OS image by only keeping necessary parts i
Re: [xcat-user] [External] search in resolv.conf
The only way I know of is through a custom postscript.

NOTE that with RHEL/CentOS/etc. 8.x, resolv.conf is managed by NetworkManager. To avoid it being accidentally overwritten, you can rename it to resolv.somethingelse and link resolv.conf to that. NetworkManager will not modify a symlink.

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

From: SOPORTE MODEMAT via xCAT-user
Sent: Wednesday, December 14, 2022 6:05 PM
To: xCAT Users Mailing list
Cc: SOPORTE MODEMAT
Subject: [External] [xcat-user] search in resolv.conf
Importance: High

Hi.

Please tell me how I can get the "search domain" populated in the network interfaces, or "search" in /etc/resolv.conf, on each compute node. I am using xcat 2.16.4 on Centos 8.4. All the information about domain, forwarders and nameserver is in the networks and site tables.

Thank you in advance for your help.

Kind regards.
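The rename-and-symlink approach described above might look roughly like this inside a custom postscript. This is a sketch under assumptions: the function names, the file name resolv.xcat and the domain in the usage comment are placeholders, and the thread does not say where the postscript would take the search domain from.

```shell
#!/bin/sh
# Sketch: keep the real resolver configuration in a separate file and make
# resolv.conf a symlink to it, then make sure a "search" line is present.

pin_resolv_conf() {
    # $1 = directory holding resolv.conf (normally /etc)
    # $2 = name for the real file (placeholder: resolv.xcat)
    dir="$1"; real="$2"
    # Already a symlink: nothing to do.
    [ -L "$dir/resolv.conf" ] && return 0
    mv "$dir/resolv.conf" "$dir/$real" || return 1
    # Relative link, resolved next to resolv.conf itself.
    ln -s "$real" "$dir/resolv.conf"
}

ensure_search() {
    # $1 = resolver file, $2 = search domain
    # Append a search line only if the file does not have one yet.
    grep -q '^search ' "$1" 2>/dev/null || printf 'search %s\n' "$2" >> "$1"
}

# Example, as it might appear in a postscript run on the node:
#   pin_resolv_conf /etc resolv.xcat
#   ensure_search /etc/resolv.conf cluster.example.com
```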
Re: [xcat-user] [External] SSH hang between management node and compute node being re-installed (syncfiles)
If you boot the node to the genesis image, can you ssh into it? If so, you may not be able to ssh into the node when syncfiles runs, but console should work. The anaconda installer is a tmux session, so you should be able to go to the next window to get a shell prompt (Ctrl-b n). Once at a shell prompt, you should be able to troubleshoot the network connection between the node and xCAT.

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

-----Original Message-----
From: Nicolas Roosen
Sent: Wednesday, April 20, 2022 06:37
To: xCAT Users Mailing list
Subject: [External] [xcat-user] SSH hang between management node and compute node being re-installed (syncfiles)

Hello,

we have a node deployment which is stuck at the syncfiles step, because the SSH from the management node to the compute node is hanging. I tried a manual SSH and it behaves the same way. Never seen this before.

The management node is RHEL 8.3 with xCAT 2.16.3, the compute node CentOS 7.3.

Has anyone already experienced this?

Thanks.

--
Nicolas Roosen
Technical Consultant
HPC & AI Business Group
Re: [xcat-user] [External] Configure secondary network nic during installation
You could feasibly use callouts within the kickstart template to configure the network using the nics table. Alternatively, you could try setting up a prescript to configure network interfaces.

The "built-in" way to configure other interfaces is to use the confignetwork postscript and the nics table. If you search for confignetwork on the docs site, you'll see plenty of links to set up networking (additional interfaces, aliases, bonds, etc.): https://xcat-docs.readthedocs.io

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

-----Original Message-----
From: Nicolas Roosen
Sent: Wednesday, April 20, 2022 10:26
To: xCAT Users Mailing list
Subject: [External] [xcat-user] Configure secondary network nic during installation

Hello,

is there a way to configure a secondary network interface during the initial deployment of a diskfull node? [installnic] works for the primary interface, but in order to access external repositories during the deployment we need to set up the secondary NIC as well.

I can create a postscript which runs right before the other packages installation, but if there is an internal xCAT way of doing this, that would be even better.

Thanks.

--
Nicolas Roosen
Technical Consultant
HPC & AI Business Group
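As a rough illustration of the "built-in" route mentioned above, the nics table attributes can be set with chdef and confignetwork added to the node's postscripts. This is a sketch, not a tested recipe: the node, NIC, IP and network names are placeholders, the network name must already exist in the networks table, and XCAT_RUN is a made-up switch that prints the commands instead of running them. Since confignetwork runs as a postscript, it may come too late if the repositories are needed while the installer itself is running.

```shell
#!/bin/sh
# Sketch: define a secondary NIC in the nics table and queue the
# confignetwork postscript for a node.
# Set XCAT_RUN=echo to print the chdef commands instead of executing them.
XCAT_RUN="${XCAT_RUN:-}"

define_secondary_nic() {
    # $1 = node, $2 = NIC name, $3 = IP address, $4 = xCAT network name
    node="$1"; nic="$2"; ip="$3"; net="$4"
    # nics table attributes are addressed per interface: nicips.<nic>, etc.
    $XCAT_RUN chdef "$node" \
        "nicips.$nic=$ip" "nictypes.$nic=Ethernet" "nicnetworks.$nic=$net" &&
    # -p appends to the existing postscripts list instead of replacing it.
    $XCAT_RUN chdef "$node" -p postscripts=confignetwork
}

# Example (all values are placeholders):
#   define_secondary_nic cn1 eth1 10.1.0.5 net10
```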
Re: [xcat-user] [External] xCAT getmacs from Cisco via SNMP?
xCAT should be able to pull SNMP info from a Cisco switch just like it would from any other switch. You'll want to make sure you have SNMP configured properly in the switches table. It also might be necessary to get the switchport name correct. Sometimes you can just use, for example, "1," but I've seen cases where you need to spell out the port as the switch knows it, like "Ethernet1/1."

You should be able to use the following snmpwalk command (with the right SNMP version settings) to see how the switch names the ports. The following works with snmpv1:

snmpwalk -v 1 -c {COMMUNITY} {SWITCH_NAME} .1.3.6.1.2.1.31.1.1.1.1

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872

From: Hannum, Keith
Sent: Tuesday, April 12, 2022 11:39
To: xcat-user@lists.sourceforge.net
Subject: [External] [xcat-user] xCAT getmacs from Cisco via SNMP?

Does xcat getmacs have a built-in way to get macs for a node from a cisco switch via snmp? Or via any other method from a cisco switch?
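To compare what the switch reports with what is entered as the switchport name, the snmpwalk output can be reduced to "index name" pairs. The filter below is a small sketch; it assumes the two output shapes net-snmp commonly prints for this OID (with and without the IF-MIB loaded), and the function name is made up.

```shell
#!/bin/sh
# Sketch: turn snmpwalk ifName output into "ifIndex portname" lines.
# Handles lines such as
#   IF-MIB::ifName.1 = STRING: Ethernet1/1
#   iso.3.6.1.2.1.31.1.1.1.1.2 = STRING: "Ethernet1/2"
port_names() {
    # Capture the trailing index of the OID and the (optionally quoted) name.
    sed -n 's/^.*\.\([0-9][0-9]*\) = STRING: "\{0,1\}\([^"]*\)"\{0,1\}$/\1 \2/p'
}

# Example, with the placeholders from the command above:
#   snmpwalk -v 1 -c {COMMUNITY} {SWITCH_NAME} .1.3.6.1.2.1.31.1.1.1.1 | port_names
```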
Re: [xcat-user] [External] ability to integrate with Ansible (Coming soon)
Confluent information can be found at hpc.lenovo.com. As for integrating w/ xCAT. At a minimum, Confluent can replace the console server. Info on doing that is here: https://hpc.lenovo.com/users/documentation/configconfluent_xcat.html Once makeconfluentcfg is run, you can use confluent to replace most all xCAT management tools (“r” commands) that don’t involve node provisioning. Beyond node management, Confluent can discover and configure BMCs w/o power-up. In order for this to work, you would define the BMC switchport info rather than the node boot interface info. Finally, Confluent adds a web interface. Info on that can be found here under “Enabling the Web UI”: https://hpc.lenovo.com/users/documentation/installconfluent_rhel.html Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Imam Toufique Sent: Saturday, May 15, 2021 00:25 To: xCAT Users Mailing list Subject: Re: [xcat-user] [External] ability to integrate with Ansible (Coming soon) Where do I find documentation on confluent and how does it integrate with xcat ? thanks On Fri, May 14, 2021 at 10:36 AM Jarrod Johnson mailto:jjohns...@lenovo.com>> wrote: Note I can't speak to xcat-invenntory, but the upcoming confluent release has some definition of ansible integration. In confluent 3.2, a deployment profile can have scripts executed directly on nodes (basically the same as postscripts): /var/lib/confluent/public/os//scripts/firstboot.d /var/lib/confluent/public/os//scripts/post.d Additionally, playbooks may be remotely triggered at points of the deployment: /var/lib/confluent/public/os//ansible/firstboot.d /var/lib/confluent/public/os//ansible/post.d The 'hosts' field will be specifically whatever node that enters that phase. The play will be executed as the confluent user on the deployment server, targeting the deploying server (with a separate automation user key that you must opt into, since confluent isn't allowed to read user's private keys). 
Also, if pertinent, the corresponding release of confluent genesis has enough python to be targeted by ansible plays. It has 'onboot' scripts and ansible plays supported. However, it's far more optional in confluent (emphasis on standby power discovery, PXE discovery if needed can be done during the PXE attempt, and the 'configbmc' script if needed should also work as 'pre.d' script for an installing system, to roll most genesis actions into the OS installers instead of requiring genesis to boot first. That is a specific interpretation of 'ansible support' and I welcome more requests (e.g. for confluent to provide facts and/or a module to use something in lieu of cmdline. From: James Goebel mailto:jkgoe...@bu.edu>> Sent: Friday, May 14, 2021 11:58 AM To: xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net> mailto:xcat-user@lists.sourceforge.net>> Subject: [External] [xcat-user] ability to integrate with Ansible (Coming soon) On the xcat-inventory github page there is the suggestion that Ansible integration is planned. Has there been any progress on this? https://github.com/xcat2/xcat-inventory ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net> https://lists.sourceforge.net/lists/listinfo/xcat-user -- Regards, Imam Toufique 213-700-5485 ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] dynamic disk partitioning with xcat
Are you asking about doing this as a prescript? I'm not sure what environment variables are in place at that point. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: SOPORTE MODEMAT via xCAT-user Sent: Monday, March 8, 2021 12:45 To: xCAT Users Mailing list Cc: SOPORTE MODEMAT Subject: Re: [xcat-user] [External] dynamic disk partitioning with xcat Hello Christian. Thank you so much for your reply. I really appreciate it. But I wonder if I use that as a postscript, can I read the $NODE variable at the time of the disk partitioning process ? Saludos cordiales. Msc. Mercy Anchundia Ruiz. Especialista de TICS Tlf. +59322976300 EXT 1537 https://hpcmodemat.epn.edu.ec/ Síguenos en Twitter: @HPCModemat MODEMAT -EPN From: Christian Caruthers mailto:ccaruth...@lenovo.com>> Sent: lunes, 8 de marzo de 2021 10:52 To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: Re: [xcat-user] [External] dynamic disk partitioning with xcat I've found it helpful to simply create a postscript called something like xcatenv: /usr/bin/env > /tmp/xcatenv ... And run it on a node w/ updatenode. This will give you all the postscripting environment variables. In your case, I believe the one you're looking for is $NODE, though you could also look at group affiliation as well. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: SOPORTE MODEMAT via xCAT-user mailto:xcat-user@lists.sourceforge.net>> Sent: Friday, March 5, 2021 13:40 To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Cc: SOPORTE MODEMAT mailto:soporte.mode...@epn.edu.ec>> Subject: [External] [xcat-user] dynamic disk partitioning with xcat Importance: High Hello guys. Could you please give me any idea to specify a dynamic disk partitioning based on the nodename? I think that a script for disk partitioning can help but I do not know which is the environmental variable that can help me to create an if clause or something like that. 
Thank you in advance for any help. I would really appreciate it. Soporte ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] xcat diskless guests: SSH Key change Warnings
2 tools that are handy in managing ssh on the cluster are “updatenode -k“ to rerun remoteshell on the nodes (makes sure the hosts got the correct keys during provisioning), and “makeknownhosts” to (re)build the known_hosts file. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Chiu, Peter (STFC,RAL,RALSP) Sent: Sunday, March 7, 2021 05:45 To: xCAT Users Mailing list Subject: [External] [xcat-user] xcat diskless guests: SSH Key change Warnings Hello all, We have recently upgraded the OS to Centos 7 with xcat diskless nodes. All are running fine as far as we can tell. Just one issue is that each time a diskless node reboots, psh or ssh command issued to the node will trigger a warning message on the ECDSA key change, please see below. The command does works. While we can simply remove the offending entry in .ssh/known_hosts, or through ssh-keygen –R ip_address to clear this for subsequent commands, I wonder if there is any way to prevent the incorrect reporting on the change of the ECDSA key. Clearly the host key has not changed in the boot image, somehow the caller thinks it has. Many thanks. Peter [root@aberdeen ~]# psh rsg15 date rsg15: @@@ rsg15: @WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @ rsg15: @@@ rsg15: IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY! rsg15: Someone could be eavesdropping on you right now (man-in-the-middle attack)! rsg15: It is also possible that a host key has just been changed. rsg15: The fingerprint for the ECDSA key sent by the remote host is rsg15: SHA256:jAAzf23XY1h+VPl7XYKcw9i68cVnw35ZeYDAG6z4SGw. rsg15: Please contact your system administrator. rsg15: Add correct host key in /root/.ssh/known_hosts to get rid of this message. rsg15: Offending ECDSA key in /root/.ssh/known_hosts:19 rsg15: Password authentication is disabled to avoid man-in-the-middle attacks. rsg15: Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks. 
rsg15: Sun 7 Mar 10:24:20 GMT 2021 This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. Opinions, conclusions or other information in this message and attachments that are not related directly to UKRI business are solely those of the author and do not represent the views of UKRI. ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] dynamic disk partitioning with xcat
I've found it helpful to simply create a postscript called something like xcatenv: /usr/bin/env > /tmp/xcatenv ... And run it on a node w/ updatenode. This will give you all the postscripting environment variables. In your case, I believe the one you're looking for is $NODE, though you could also look at group affiliation as well. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: SOPORTE MODEMAT via xCAT-user Sent: Friday, March 5, 2021 13:40 To: xCAT Users Mailing list Cc: SOPORTE MODEMAT Subject: [External] [xcat-user] dynamic disk partitioning with xcat Importance: High Hello guys. Could you please give me any idea to specify a dynamic disk partitioning based on the nodename? I think that a script for disk partitioning can help but I do not know which is the environmental variable that can help me to create an if clause or something like that. Thank you in advance for any help. I would really appreciate it. Soporte ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
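Once the environment dump confirms that NODE is set, the "if clause" asked about in the original question could be a simple case statement on the node name. A minimal sketch follows; the name patterns and layout file names are hypothetical, and whether NODE is already available at the point where partitioning runs is not confirmed in this thread.

```shell
#!/bin/sh
# Sketch: choose a partition layout based on the node name.
# Patterns and layout names are placeholders for illustration only.
pick_layout() {
    case "$1" in
        storage*) echo "partitions.storage" ;;
        gpu*)     echo "partitions.gpu" ;;
        *)        echo "partitions.default" ;;
    esac
}

# In a postscript, where xCAT exports NODE:
#   layout=$(pick_layout "$NODE")
```

Matching on group membership instead would follow the same pattern, using whichever variable in the dump carries the node's groups.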
Re: [xcat-user] [External] nodeset and UEFI boot xnba files
Looking at the dhcpd.leases file, it appears the determining factor in which file a node gets when it PXE boots is the client-architecture. Unfortunately, I can't translate the different values I see in the leases file ("00:00" and "00:09"). So while I see your point, and not keeping both files up to date seems unintuitive on the surface, I don't know enough about how makedhcp is building this file and the conditions in which a node will receive the ".uefi" file. I have a cluster I'm working on with all xnba nodes, some stateful and some stateless. I don't see the uefi file mentioned in the console logs of any of them. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Thomas HUMMEL Sent: Tuesday, March 2, 2021 04:37 To: xcat-user@lists.sourceforge.net Subject: Re: [xcat-user] [External] nodeset and UEFI boot xnba files On 3/1/21 6:27 PM, Christian Caruthers wrote: > If it changes to something like: > > #!gpxe > #boot > Exit Yes it is > Of course, you should not configure a stateless node as "boot" I know that, I just tested the behavior relative to file content for both cases (stateless and stateful) > Stateful nodes shouldn't get anything from a PXE request unless they're being > (re)discovered or (re)installed. > If they're "full UEFI," they shouldn't be sending a PXE request after they're > installed as the OS would have inserted its own entry at the top of the boot > order during install. Correct. However I'd say that for consistency's sake all files (.elilo and .uefi) should be changed as well, don't you think? Imagine a stateful node in production which for some reason gets its UEFI manually changed back to PXE first: it would get reinstalled instead of just booting as one could have expected having previously run nodeset boot. This seems counterintuitive and dangerous to me. 
The second point (unrelated to my original post) is that by changing UEFI boot order to disk first, a stateful install kind of detaches itself from being handled by xCAT. I mean, to reinstall them, you'd have to manually (or via ipmitool or some local action) set the order to PXE first beforehand, which could be annoying if you have many of them, while having a script issuing just an Exit (nodeset boot) would provide both possibilities (boot from disk or reinstall). So basically I think that it could be positive to - change .elilo and .uefi content as well when running nodeset - have a stateful install leave PXE first but I might not take into account some scenarios What do you think? Thanks for your help -- Thomas HUMMEL
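On translating the client-architecture values seen in the leases file: as far as I can tell these are the DHCP option 93 codes listed in RFC 4578, where 00:00 is a legacy BIOS client and 00:09 is EFI x86-64. A small lookup sketch follows; the function name is made up, and the table only covers the few codes I am confident about. Some x86-64 UEFI firmware is known to report 00:07 instead of 00:09, so treat the mapping as a hint rather than a rule.

```shell
#!/bin/bash
# Sketch: translate a dhcpd "client-architecture" value into a readable name.
# Codes follow the RFC 4578 list; arch_name is a hypothetical helper name.

arch_name() {
    case "$1" in
        00:00) echo "Intel x86PC (legacy BIOS)" ;;
        00:06) echo "EFI IA32" ;;
        00:07) echo "EFI BC" ;;
        00:09) echo "EFI x86-64" ;;
        *)     echo "unknown" ;;
    esac
}

# Example: arch_name 00:09
```

Read this way, the elsif clause quoted in the original post (client-architecture = 00:09) is the branch a "full UEFI" x86-64 node would take.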
Re: [xcat-user] [External] nodeset and UEFI boot xnba files
If it changes to something like: #!gpxe #boot Exit ... Then that is, to the best of my knowledge, the expected behavior when you run 'nodeset {node} boot'. Of course, you should not configure a stateless node as "boot" at all since it will always require a stateless image in response to a PXE request (after the initial boot kernel). Stateful nodes shouldn't get anything from a PXE request unless they're being (re)discovered or (re)installed. If they're "full UEFI," they shouldn't be sending a PXE request after they're installed as the OS would have inserted its own entry at the top of the boot order during install. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Thomas HUMMEL Sent: Monday, March 1, 2021 12:01 To: xcat-user@lists.sourceforge.net Subject: [External] [xcat-user] nodeset and UEFI boot xnba files Hello, I'm using xCAT 2.16.1 on CentOS 8.3.2011 to provision stateless and stateful CentOS 8.3 nodes booting in "full UEFI" mode (no DUAL nor Legacy mode). This works fine but I noticed today that the tftp xnba .uefi file is not touched by nodeset For instance, for a switch based discovered uefi booted stateless node foobar, xCAT generated the 2 following xnba files : /tftpboot/xcat/xnba/nodes/foobar /tftpboot/xcat/xnba/nodes/foobar.uefi holding the usual imgfetch, imgload and so on commands According to the generated lease file, I'd say that, for an UEFI boot, only /tftpboot/xcat/xnba/nodes/foobar.uefi is served: } elsif option user-class-identifier = "xNBA" and option client-architecture = 00:09 { supersede server.filename = "http://x.x.x.x:80/tftpboot/xcat/xnba/nodes/foobar.uefi;; But I noticed that, either for stateful or stateless case, nodeset foobar boot (for instance) only modifies /tftpboot/xcat/xnba/nodes/foobar, which seems wrong to me ? Can you help me figure out if this is an issue or if I'm missing something ? 
Thanks for your help -- Thomas HUMMEL
[xcat-user] syncfiles not working at all
Running confluent-3.0.4-xcat-2.16.0.lenovo1, I have 2 service nodes that only handle tftpserver and nfsserver functions: nodels midway3-sn1 servicenode midway3-sn1: servicenode.nfsserver: 1 midway3-sn1: servicenode.node: midway3-sn1 midway3-sn1: servicenode.tftpserver: 1 midway3-sn1: servicenode.ntpserver: midway3-sn1: servicenode.disable: midway3-sn1: servicenode.conserver: midway3-sn1: servicenode.ftpserver: midway3-sn1: servicenode.ipforward: midway3-sn1: servicenode.nimserver: midway3-sn1: servicenode.comments: midway3-sn1: servicenode.monserver: midway3-sn1: servicenode.dhcpinterfaces: midway3-sn1: servicenode.proxydhcp: midway3-sn1: servicenode.nameserver: midway3-sn1: servicenode.ldapserver: midway3-sn1: servicenode.dhcpserver: The node is set up as such: Object name: midway3-0180 arch=x86_64 bmc=midway3-0180-ipmi bmcport=1 currstate=netboot centos8.2-x86_64-lenovo-compute groups=ipmi,compute,a5,ca5 mgt=ipmi netboot=xnba os=centos8.2 postbootscripts=otherpkgs postscripts=uoc-hostname,syslog,remoteshell,syncfiles,uoc-env,confignetwork,uoc-repoclean,uoc-computedisk,setupntp,rcc/configsysctl,rcc/config_compute_net,rcc/config_compute_misc,uoc-gpfs,uoc-slurm profile=lenovo-compute provmethod=Lenovo-compute-netboot-compute servicenode=midway3-mgt1 status=failed updatestatus=synced updatestatustime=02-17-2021 15:22:26 xcatmaster=midway3-sn1 The osimage definition: synclists=/install/custom/netboot/centos/lenovo.centos8.synclist That synclist file only contains: /etc/hosts -> /etc/ Initially, I had a common hosts file that cited groups for what needed to be synced over. In troubleshooting, I determined that syncfiles is not syncing anything over at all. While the node boots, I don't see /var/xcat get created on the service node when syncfiles runs. 
In /var/log/xcat/xcat.log on the node, I see: Fri Feb 19 07:15:52 CST 2021 [info]: xcat.deployment.postbootscript: postbootscript start..: syncfiles + '[' -n xcat.deployment.postbootscript ']' + log_label=xcat.deployment.postbootscript ++ basename ./syncfiles + bname=syncfiles + '[' -d /.statelite ']' + '[' -n 0 ']' + '[' 0 -eq 1 ']' + '[' -n '' ']' + RCP= + '[' '!' -e /usr/bin/rsync ']' + logger -t xcat.deployment.postbootscript -p local4.info 'Performing syncfiles postscript' ++ uname + osname=Linux + xcatpostdir=/xcatpost + logger -t xcat.deployment.postbootscript -p local4.info 'syncfiles: the OS name = Linux' + quit=no + count=5 + returncode=0 + '[' no = no ']' + cat /etc/os-release + grep -i cumulus + '[' Linux = Linux ']' ++ /xcatpost/startsyncfiles.awk -v RCP= + returncode=1 + '[' 1 -eq 0 ']' + '[' 5 -eq 0 ']' + let SLI=4224%10 + let SLI=SLI+10 + sleep 14 + let count=count-1 + '[' no = no ']' + cat /etc/os-release + grep -i cumulus + '[' Linux = Linux ']' ++ /xcatpost/startsyncfiles.awk -v RCP= + returncode=1 + '[' 1 -eq 0 ']' + '[' 4 -eq 0 ']' + let SLI=31872%10 + let SLI=SLI+10 + sleep 12 + let count=count-1 + '[' no = no ']' + cat /etc/os-release + grep -i cumulus + '[' Linux = Linux ']' ++ /xcatpost/startsyncfiles.awk -v RCP= + returncode=1 + '[' 1 -eq 0 ']' + '[' 3 -eq 0 ']' + let SLI=14810%10 + let SLI=SLI+10 + sleep 10 + let count=count-1 + '[' no = no ']' + cat /etc/os-release + grep -i cumulus + '[' Linux = Linux ']' ++ /xcatpost/startsyncfiles.awk -v RCP= + returncode=1 + '[' 1 -eq 0 ']' + '[' 2 -eq 0 ']' + let SLI=8430%10 + let SLI=SLI+10 + sleep 10 + let count=count-1 + '[' no = no ']' + cat /etc/os-release + grep -i cumulus + '[' Linux = Linux ']' ++ /xcatpost/startsyncfiles.awk -v RCP= + returncode=1 + '[' 1 -eq 0 ']' + '[' 1 -eq 0 ']' + let SLI=6784%10 + let SLI=SLI+10 + sleep 14 + let count=count-1 + '[' no = no ']' + cat /etc/os-release + grep -i cumulus + '[' Linux = Linux ']' ++ /xcatpost/startsyncfiles.awk -v RCP= + returncode=1 + '[' 1 
-eq 0 ']' + '[' 0 -eq 0 ']' + quit=yes + let count=count-1 + '[' yes = no ']' + '[' 1 -eq 0 ']' + logger -t xcat.deployment.postbootscript -p local4.err 'syncfiles: Perform Syncing File action encountered error' + '[' 1 -eq 0 ']' Fri Feb 19 07:16:53 CST 2021 [info]: xcat.deployment.postbootscript: postbootscript end...:syncfiles return with 1 Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
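For anyone reading the trace above: it appears to show one command (/xcatpost/startsyncfiles.awk) being retried six times, with a sleep of 10 to 19 seconds (a random number modulo 10, plus 10) between attempts, and the postscript giving up with return code 1 once the counter hits zero. The sketch below is my approximation of that pattern for reference, not the actual syncfiles source; retry_with_backoff and RETRY_SLEEP are names I made up.

```shell
#!/bin/bash
# Approximation of the retry pattern visible in the trace, NOT the real
# syncfiles script. RETRY_SLEEP is a hypothetical override for testing.

retry_with_backoff() {
    local attempts="$1"
    shift
    local rc=1
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0            # command succeeded, stop retrying
        else
            rc=$?               # remember the failing exit status
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            # Sleep 10-19 seconds, like the RANDOM%10 + 10 in the trace
            sleep "${RETRY_SLEEP:-$((RANDOM % 10 + 10))}"
        fi
    done
    return "$rc"
}

# Example: retry_with_backoff 6 /xcatpost/startsyncfiles.awk -v RCP=
```

The useful takeaway is that every attempt returned 1 immediately, so the request to the server is failing each time rather than timing out partway through a transfer.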
Re: [xcat-user] [External] Switch based discovery with mix of splitted and non splitted ports
Long ago (>10 years) I had a similar issue, w/ SMC switches I believe. Using the exact port name from the snmpwalk output (e.g. "Ethernet49/1") fixed it for me then. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Thomas HUMMEL Sent: Wednesday, February 17, 2021 08:56 To: xcat-user@lists.sourceforge.net Subject: Re: [xcat-user] [External] Switch based discovery with mix of splitted and non splitted ports On 2/17/21 2:41 PM, Christian Caruthers wrote: > What do you see if you run (assumes you're using SNMPv1 & "public" community > string): > > snmpwalk -v 1 -c public {SWITCH_NAME} .1.3.6.1.2.1.31.1.1.1.1 > > This should show how the switch reports the ports. As a matter of fact, it reports (please see attachment) as Ethernetn/p. Still the / or dotted syntax worked previously (granted not on the same switch model nor the same xCAT version) Thanks -- TH > > Regards, > Christian Caruthers > Lenovo Professional Services > Mobile: 757-289-9872 > > -Original Message- > From: Thomas HUMMEL > Sent: Wednesday, February 17, 2021 08:35 > To: xcat-user@lists.sourceforge.net > Subject: [External] [xcat-user] Switch based discovery with mix of > splitted and non splitted ports > > Hello, > > Currently, I'm using xCAT 2.16.1 on CentOS 8.2 to provision CentOS 8.3 > stateless nodes using switch based discovery. > > I've been doing this for many years with success. > > I always did that in either one of the 2 following cases : > > a) no switch port were using splitters > b) all switch ports were using splitter > > For b) I did successfully use either the n/x or the n.x syntax > > Today, I encountered what may seem an xCAT issue (not sure though, it could > be a switch configuration issue) for it is the first time I've got a switch > where I mix direct and split port attachment. 
> > The actual switch is an Arista 7050TX-72Q: 48x 1/10GbE (RJ 45) and 6x > 40GbE, where all nodes are rj45 attached except one which is connected > using one of the 4 link of a 40G port using a splitter > > What happens is the following : > > a) nodeA on port 1 was previously provisionned using switch-based > discovery without any problem > > b) I set up the node definition for nodeS which is connected to port > 49.1 (I used this dot-based syntax) > > -> nodeS is discovered with the name of nodeA and nodeA is assigned > nodeS mac adress > > so things get mixed up. > > Can you help me figure out > > - what syntax is the canonical one for split ports > > - if this is an xCAT problem or a switch problem > > ? > > Thanks for your help > > -- > Thomas HUMMEL > > > > ___ > xCAT-user mailing list > xCAT-user@lists.sourceforge.net > https://urldefense.com/v3/__https://lists.sourceforge.net/lists/listin > fo/xcat-user__;!!JFdNOqOXpB6UZW0!7dZN6eQ_2yzAFzPWZ34yOdGlqCWFB4iCVf76j > SJL0nECAzYU9o_0J_VEBrr8ZTP_IWunXQ$ > > > ___ > xCAT-user mailing list > xCAT-user@lists.sourceforge.net > https://urldefense.com/v3/__https://lists.sourceforge.net/lists/listin > fo/xcat-user__;!!JFdNOqOXpB6UZW0!7dZN6eQ_2yzAFzPWZ34yOdGlqCWFB4iCVf76j > SJL0nECAzYU9o_0J_VEBrr8ZTP_IWunXQ$ > ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] Switch based discovery with mix of splitted and non splitted ports
What do you see if you run (assumes you're using SNMPv1 & "public" community string): snmpwalk -v 1 -c public {SWITCH_NAME} .1.3.6.1.2.1.31.1.1.1.1 This should show how the switch reports the ports. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Thomas HUMMEL Sent: Wednesday, February 17, 2021 08:35 To: xcat-user@lists.sourceforge.net Subject: [External] [xcat-user] Switch based discovery with mix of splitted and non splitted ports Hello, Currently, I'm using xCAT 2.16.1 on CentOS 8.2 to provision CentOS 8.3 stateless nodes using switch based discovery. I've been doing this for many years with success. I always did that in either one of the 2 following cases: a) no switch ports were using splitters b) all switch ports were using splitters For b) I did successfully use either the n/x or the n.x syntax Today, I encountered what may seem an xCAT issue (not sure though, it could be a switch configuration issue) for it is the first time I've got a switch where I mix direct and split port attachment. The actual switch is an Arista 7050TX-72Q: 48x 1/10GbE (RJ 45) and 6x 40GbE, where all nodes are rj45 attached except one which is connected using one of the 4 links of a 40G port using a splitter What happens is the following: a) nodeA on port 1 was previously provisioned using switch-based discovery without any problem b) I set up the node definition for nodeS which is connected to port 49.1 (I used this dot-based syntax) -> nodeS is discovered with the name of nodeA and nodeA is assigned nodeS's MAC address so things get mixed up. Can you help me figure out - what syntax is the canonical one for split ports - if this is an xCAT problem or a switch problem ? Thanks for your help -- Thomas HUMMEL
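To make the snmpwalk output above easier to compare against what is entered for each node's switch port, here is a small filter sketch. It assumes the usual net-snmp line format ("<oid>.<ifIndex> = STRING: <name>"); the function name and the sample lines in the comment are illustrative, not captured from a real switch.

```shell
#!/bin/bash
# Sketch: reduce snmpwalk ifName output to "ifIndex name" pairs.
# Assumes net-snmp style lines such as (illustrative values only):
#   IF-MIB::ifName.1 = STRING: Ethernet1
#   IF-MIB::ifName.49001 = STRING: Ethernet49/1

list_ifnames() {
    awk -F' = STRING: ' 'NF == 2 {
        n = split($1, oid, ".")   # last OID component is the ifIndex
        gsub(/"/, "", $2)         # drop quotes if the value is quoted
        print oid[n], $2
    }'
}

# Example:
#   snmpwalk -v 1 -c public {SWITCH_NAME} .1.3.6.1.2.1.31.1.1.1.1 | list_ifnames
```

If a split port shows up as "Ethernet49/1" in this list, that exact string is the first thing worth trying in the node's port definition.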
[xcat-user] makehosts not creating aliases Possibly Bug 6828?
Running 2.16.0.lenovo1 on stock CentOS 8.2. I have the following setup: nodels login hosts.hostnames midway3-login1: login1,l1 midway3-login2: login2,l2 When I run makehosts login (w/ or w/o "-l"), these are not created: grep login /etc/hosts ##.##.##.## midway3-login1.rcc.local midway3-login1 ##.##.##.## midway3-login2.rcc.local midway3-login2 It appears to be related to bug 6828. Was there a workaround beyond duplicating efforts and defining the interface in the nics table? As a test, I added the following: nodels login nics.nicaliases midway3-login1: bond0!login1,l1 midway3-login2: bond0!login2,l2 After running makehosts: ##.##.##.## midway3-login1.rcc.local midway3-login1 login1 ##.##.##.## midway3-login2.rcc.local midway3-login2 login2 ... So it appears to have gotten one of the aliases. FWIW, if I remove the first alias, leaving l[1,2] as aliases, makehosts uses that entry: ##.##.##.## midway3-login1.rcc.local midway3-login1 l1 ##.##.##.## midway3-login2.rcc.local midway3-login2 l2 Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
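A quick way to check which aliases actually landed in /etc/hosts after each makehosts run is sketched below. has_alias is a made-up helper, and it only inspects a hosts-format file; it does not call xCAT.

```shell
#!/bin/bash
# Sketch: succeed if HOSTS_FILE has a non-comment line that contains both
# HOSTNAME and ALIAS as names. has_alias is a hypothetical helper.
# Usage: has_alias HOSTS_FILE HOSTNAME ALIAS

has_alias() {
    awk -v host="$2" -v want="$3" '
        $1 !~ /^#/ {
            found_host = 0; found_alias = 0
            for (i = 2; i <= NF; i++) {      # field 1 is the IP address
                if ($i == host) found_host = 1
                if ($i == want) found_alias = 1
            }
            if (found_host && found_alias) ok = 1
        }
        END { exit !ok }
    ' "$1"
}

# Example:
#   for a in login1 l1; do
#       has_alias /etc/hosts midway3-login1 "$a" || echo "missing alias: $a"
#   done
```

With the nics.nicaliases test described above, this would report l1 as missing for midway3-login1, which matches the "only one alias" behavior.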
Re: [xcat-user] [External] XCAT 2.16 Discovery Mode - Genesis
Regarding "genesis" functionality: If you have a dynamic range defined in your provisioning network (dynamic range field of networks table), then xCAT will try to discover anything that PXE boots on that network. This assumes you have site.dhcpinterfaces defined to limit DHCP to the provisioning net. To disable automatic discovery, remove the dynamic range, and run "makedhcp -n". If you are trying to get recently installed cluster nodes to not PXE, the easiest way is modifying the bootorder. If you are using UEFI install, the OS should do this for you by inserting its own boot entry at the top of the list (e.g., "Red Hat Enterprise Linux"). Once that entry is there, the node should not be PXE booting unless the boot entry is corrupted. If you're using legacy boot, you can put the HDD before the network (pasu set BootOrder.BootOrder "Legacy Only=Hard Disk=PXE Network"). In either case, if you want to reinstall, you can use the rsetboot command (or nodeboot in confluent) to PXE on the next reboot. If recently installed nodes are still being discovered, my guess would be that the MAC is not discovered. A MAC might be discovered, but not necessarily the one it's trying to boot from. You can enter multiple MACs for a node and specify NOIP for MACs you don't want it to boot from (mac table entry: node,,"1:2:3:4|5:6:7:8!NOIP"). Make sure to run makedhcp after doing this. If all this doesn't work, what does "nodeset status" return? Perhaps the node's install status is not getting updated correctly. Regarding the ssh permissions, I'm not sure. It looks like the remoteshell postscript is slated to run. Perhaps check /var/log/xcat/ on the node for postscript logs to get a hint? What does /install/postscripts/_ssh look like on the management node? How about root's .ssh dir also on the management node? Regarding otherpkgs: The otherpkgs postscript should be under /install/postscripts with all postscripts. Things to remember when using otherpkgs: The otherpkgdir is a repo. 
Make sure you run createrepo on the dir after adding packages. The otherpkglist is a list of package names, not file names. So you would want "ssh-server" rather than "ssh-server-1.2.3-4.el7.x86_64.rpm". If you want the 32-bit version of a package, you can specify "ssh-server.i686" for example. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: James Ault via xCAT-user Sent: Sunday, August 9, 2020 23:50 To: xcat-user@lists.sourceforge.net Cc: James Ault Subject: [External] [xcat-user] XCAT 2.16 Discovery Mode - Genesis Hello. I am running XCAT 2.16 on CentOS 7 on an air-gapped network and I have a few questions: 1) how may I disable the "genesis" discovery functionality? A previous node management system had a default behavior of booting from network, and only the nodes that were marked for install would be installed, but all other nodes would boot from local disk with a slight delay. Many of these nodes are getting stuck in a boot loop where they boot genesis instead of their own OS from local disk, and the genesis process never completes, it just loops forever. I need to make this discovery process work properly or shut it off completely. One relevant error message on console: "xcat.genesis.minixcatd: The request is already processed by xCAT master, but not matched." (repeat every 5 minutes for ever) 2) I have successfully installed an OS (CentOS 7.8) on a few nodes, but the nodes somehow are configured with ssh key files that have permissions "640" which cause the sshd to fail and exit with an error, which means I cannot login remotely. This does not seem like a reasonable default. If there is a configuration setting in XCAT that will help me fix this before the OS install is finished that would be very helpful. 
3) I want to install other packages during the post install process, but any attempts so far have not succeeded: Example typed from printed logs: xcatmn# lsdef node1 node1: arch=x86_64 bmc=10.10.10.10 bmcpassword=(enter_password_here) cons=ipmi consoleenabled=1 groups=all,x86_64 mac=1:2:3:4:5:6 mgt=ipmi netboot=xnba os=centos7.8 postbootscripts=otherpkgs postscripts=systlog,remoteshell,syncfiles profile=compute provmethod=centos7.8-x86_64-install-compute routenames=defaultroute status=failed status=(insert date here) xcatmn# lsdef -t osimage centos7.8-x86_64-install-compute imagetype=linux osarch=x86_64 osdistroname=centos7.8-x86_64 osname=Linux osvers=centos7.8 otherpkgdir=/install/post/otherpkgs/centos7.8/x86_64 otherpkglist=/install/custom/pkglist/compute.otherpkglist.txt partitionfile=/install/custom/partition/compute-default-partition.txt pkgdir=/install/centos7.8/x86_64 pkglist=/opt/xcat/share/xcat/inst
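Since the otherpkglist wants package names rather than file names (see the otherpkgs notes in the reply above), here is a sketch that derives a name from an RPM file name by stripping the extension, architecture, release, and version in turn. It assumes the conventional name-version-release.arch.rpm layout; rpm_file_to_name is a made-up helper. Where the rpm tool is available, "rpm -qp --qf '%{NAME}\n' FILE" reads the name from the package header and is the more reliable route.

```shell
#!/bin/bash
# Sketch: derive the package name from a conventional RPM file name
# (name-version-release.arch.rpm). rpm_file_to_name is hypothetical.

rpm_file_to_name() {
    local f="${1##*/}"   # drop any directory part
    f="${f%.rpm}"        # drop the .rpm extension
    f="${f%.*}"          # drop the architecture (e.g. .x86_64)
    f="${f%-*}"          # drop the release (e.g. -4.el7)
    f="${f%-*}"          # drop the version (e.g. -1.2.3)
    echo "$f"
}

# Example: build a candidate otherpkglist from the files in otherpkgdir
#   for f in /install/post/otherpkgs/centos7.8/x86_64/*.rpm; do
#       rpm_file_to_name "$f"
#   done | sort -u
```

Remember to rerun createrepo on the otherpkgdir after adding packages, as noted above.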
[xcat-user] confignetwork issue with default gw & dns in CentOS 8.1
Running 2.15.1-lenovo2 xCAT release on CentOS 8.1 Compute nodes have 2 NICs: ens3f0: 1Gb, provisioning net, 172.23.16.121/20, gw is .1 ens1f0: 50Gb highspeed, 172.25.16.121/24, no gw When I run "confignetwork" as a postbootscript, ens1f0 is set up correctly with an IP on the highspeed net and nothing else. If I run "confignetwork -s" as a postbootscript, both NICs are set up w/ IP, but both get the default gateway applied. Also, no DNS is ever set up. I've defined site.nameservers as well as network.nameservers & network.domain for the prov network. lsdef -t network prov Object name: prov dhcpserver= domain=hdp.local gateway=172.23.16.1 mask=255.255.240.0 mgtifname=ens2f0 mtu=1500 nameservers=172.23.16.242 net=172.23.16.0 ntpservers=172.23.16.1 tftpserver= lsdef -t network storage Object name: storage mask=255.255.255.0 mgtifname=ens2f0 mtu=1500 net=172.25.16.0 After running confignetwork -s, I cannot remove the errant gw setting from ens1f0 (nmcli c mod xcat-ens1f0 -ipv4.gateway 172.23.16.1). The only way I've found to fix ens1f0's gw setting is to set ipv4.never-default to yes and bounce the interface. To fix DNS, I have to add the DNS to ens3f0: nmcli c mod xcat-ens3f0 ipv4.dns {DNSSERVER} ipv4.dns-search {DOMAIN} and restart NetworkManager to get things working correctly. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872
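The manual workaround described above could be collected into a postbootscript along these lines. This is a sketch of the workaround only, not a fix for confignetwork. The function and the DRY_RUN switch are my own naming; by default the script only prints the commands so they can be reviewed before anything is changed on a node.

```shell
#!/bin/bash
# Sketch of the manual workaround as a script. fix_node_network and DRY_RUN
# are hypothetical names. With DRY_RUN=1 (the default here) the commands are
# only printed, not executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# fix_node_network HIGHSPEED_CONN PROV_CONN DNS_SERVER DNS_DOMAIN
fix_node_network() {
    local hs_conn="$1" prov_conn="$2" dns="$3" domain="$4"
    # Keep the highspeed NIC from ever installing a default route
    run nmcli connection modify "$hs_conn" ipv4.never-default yes
    # Bounce the highspeed connection so the setting takes effect
    run nmcli connection down "$hs_conn"
    run nmcli connection up "$hs_conn"
    # Put DNS server and search domain on the provisioning connection
    run nmcli connection modify "$prov_conn" ipv4.dns "$dns" ipv4.dns-search "$domain"
    run systemctl restart NetworkManager
}

# Example, using the values from the network definitions above:
#   DRY_RUN=0 fix_node_network xcat-ens1f0 xcat-ens3f0 172.23.16.242 hdp.local
```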
Re: [xcat-user] [External] Networking: groups and nodes; nic* attributes
> Now that I type it out - and I read your answer to question 2, my presumption > is that it's the latter - since there is no way to remove the nic* attributes > in node01 without removing it from the group? Unless, perhaps, there is a nics table definition for node01 that overrides the group entry you're talking about. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Lachlan Simpson Sent: Tuesday, March 3, 2020 4:34 PM To: xcat-user@lists.sourceforge.net Subject: Re: [xcat-user] [External] Networking: groups and nodes; nic* attributes On Tue, 2020-03-03 at 15:27 +, Christian Caruthers wrote: > If chdef is not working as you expect, you can always use tabedit nics > to open the table in an editor and manually edit it. Be careful in > there, though. It's easy to put info into the wrong field. > > Not sure what you're asking in the first question. Can you send the > commands you ran along with the output of an lsdef for the node? > Re question 1, imagine this: lsdef node01 Object name: node01 arch=x86_64 bmc=10.197.32.01 cons=ipmi ip=10.197.34.01 mac=38:68:dd:13:f4:a0 mgt=ipmi netboot=xnba os=centos7.7 profile=compute status=booted Then I add node01 to a group called "other_network", which now has definition lsdef -t group other_network Object name: mpi members=node01 nicaliases.eno2=|node(\d+)|($1).mpi| nichostnamesuffixes.eno2=-mpi nicips.eno2=|node(\d+)|10.197.36.($1+0)| nicnetworks.eno2=mpi nictypes.eno2=ethernet So. 
When I now do lsdef on node01, does it look like the above, or does it look like this: lsdef node01 Object name: node01 arch=x86_64 bmc=10.197.32.01 cons=ipmi ip=10.197.34.01 mac=38:68:dd:13:f4:a0 mgt=ipmi netboot=xnba os=centos7.7 profile=compute status=booted nicaliases.eno2=node01.mpi nichostnamesuffixes.eno2=-mpi nicips.eno2=10.197.36.01 nicnetworks.eno2=mpi nictypes.eno2=Ethernet Now that I type it out - and I read your answer to question 2, my presumption is that it's the latter - since there is no way to remove the nic* attributes in node01 without removing it from the group? Cheers L. > Regards, > Christian Caruthers > Lenovo Professional Services > Mobile: 757-289-9872 > > -Original Message- > From: Lachlan Simpson > Sent: Monday, March 2, 2020 9:29 PM > To: xcat-user@lists.sourceforge.net > Subject: [External] [xcat-user] Networking: groups and nodes; nic* > attributes > > I have two questions: > > 1. If a node belongs to a group with nic* defined are those additions added > to the node's def? > > for eg, if I have an existing node *without* those nics defined, does > the installation/config process add those values to the node? (lsdef > before gives nothing, lsdef after has nic* entries) > > 2. How does one remove errant nicalias entries (for example, any nic* > key/value in reality) from a node's definition? Given that chdef -m/-p > doesn't work ( > https://xcat-docs.readthedocs.io/en/stable/advanced/domain_name_resolu > tion/domain_name_resolution.html#setting-individual-nic-attribute-valu > es > ) > > I have tried re-defining the values I wanted to change and just ended > up with two entries for nicaliases.ib0= in the node def? > > cheers > L. 
-- Lachlan Simpson Research Technology Services UNSW Research Technology Services Level 3, Chemical Sciences Building F10 UNSW SYDNEY NSW 2052 AUSTRALIA E: lachlan.simp...@unsw.edu.au W: https://research.unsw.edu.au ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] Re: Networking: groups and nodes; nic* attributes
If you want node01-ib to be in DNS (assuming it's not handled by some external DNS server), then it needs to be in /etc/hosts either by manually adding it or via the hosts table and running makehosts. As for your second question, I'm no expert at regexp, but I believe you would have node01.ib. It would be pretty easy to test, though. Assuming node01 is the provisioning name of the node, and node01-ib is the desired name of its IPoIB interface, you would define node01 as a node and set up nics.nichostnamesuffixes as "-ib" for that node or node group. If the IPoIB network should have a different domain name than the provisioning net, then you can define that in networks.domain for the ib network. An example: nodels n01 nics -b n01: nics.nichostnamesuffixes: ib0!-ib (inherited from group compute) n01: nics.nicips: ib0!172.25.0.11 (inherited from group compute) n01: nics.nicnetworks: ib0!ibnet (inherited from group compute) n01: nics.nictypes: ib0!Infiniband (inherited from group compute) tabdump nics #node,nicips,nichostnamesuffixes,nichostnameprefixes,nictypes,niccustomscripts,nicnetworks,nicaliases,nicextraparams,nicdevices,nicsadapter,comments,disable "compute","|\D+(\d+).*$|ib0!172.25.0.(($1-0)+10)|","ib0!-ib",,"ib0!Infiniband",,"ib0!ibnet",, Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Lachlan Simpson Sent: Monday, March 2, 2020 9:44 PM To: xcat-user@lists.sourceforge.net Subject: [External] Re: [xcat-user] Networking: groups and nodes; nic* attributes Also, a follow up. If I have machine node01 nicips.ib0=10.1.1.6 nichostnamesuffixes.ib0=-ib Should create the /etc/hosts entry node01-ib 10.1.1.6 If I *also* have in the ib group (which node01 belongs to) nicaliases.ib0=|(k\d+)|($1).ib| does that create node01.ib OR node01-ib.ib? Cheers L. On Tue, 2020-03-03 at 02:28 +, Lachlan Simpson wrote: > I have two questions: > > 1. 
If a node belongs to a group with nic* defined are those additions added > to the node's def? > > for eg, if I have an existing node *without* those nics defined, does > the installation/config process add those values to the node? (lsdef > before gives nothing, lsdef after has nic* entries) > > 2. How does one remove errant nicalias entries (for example, any nic* > key/value in reality) from a node's definition? Given that chdef -m/-p > doesn't work ( > https://xcat-docs.readthedocs.io/en/stable/advanced/domain_name_resolu > tion/domain_name_resolution.html#setting-individual-nic-attribute-valu > es > ) > > I have tried re-defining the values I wanted to change and just ended > up with two entries for nicaliases.ib0= in the node def? > > cheers > L. -- Lachlan Simpson Research Technology Services UNSW Research Technology Services Level 3, Chemical Sciences Building F10 UNSW SYDNEY NSW 2052 AUSTRALIA E: lachlan.simp...@unsw.edu.au W: https://research.unsw.edu.au ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
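To show what the group-level regular expression in the tabdump above works out to per node, here is a shell approximation of the substitution. It only mimics the one pattern from the example (capture the digits after the leading non-digits, then compute (($1-0)+10)); it is not how xCAT evaluates table regexes internally, and expand_ib_ip is a made-up name.

```shell
#!/bin/bash
# Approximation of the nics.nicips expression from the example:
#   |\D+(\d+).*$|ib0!172.25.0.(($1-0)+10)|
# expand_ib_ip is a hypothetical helper, not an xCAT command.

expand_ib_ip() {
    local num
    # Capture the first run of digits that follows one or more non-digits
    num=$(printf '%s\n' "$1" | sed -n 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*$/\1/p')
    [ -n "$num" ] || return 1
    # Strip leading zeros so "08" is treated as decimal 8, not bad octal
    num=$(printf '%s\n' "$num" | sed 's/^0*//')
    echo "ib0!172.25.0.$(( ${num:-0} + 10 ))"
}

# Example: expand_ib_ip n01
```

For n01 this gives ib0!172.25.0.11, which matches the "inherited from group compute" value in the nodels output above.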
Re: [xcat-user] [External] Networking: groups and nodes; nic* attributes
If chdef is not working as you expect, you can always use tabedit nics to open the table in an editor and manually edit it. Be careful in there, though. It's easy to put info into the wrong field. Not sure what you're asking in the first question. Can you send the commands you ran along with the output of an lsdef for the node? Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Lachlan Simpson Sent: Monday, March 2, 2020 9:29 PM To: xcat-user@lists.sourceforge.net Subject: [External] [xcat-user] Networking: groups and nodes; nic* attributes I have two questions: 1. If a node belongs to a group with nic* defined are those additions added to the node's def? for eg, if I have an existing node *without* those nics defined, does the installation/config process add those values to the node? (lsdef before gives nothing, lsdef after has nic* entries) 2. How does one remove errant nicalias entries (for example, any nic* key/value in reality) from a node's definition? Given that chdef -m/-p doesn't work ( https://xcat-docs.readthedocs.io/en/stable/advanced/domain_name_resolution/domain_name_resolution.html#setting-individual-nic-attribute-values ) I have tried re-defining the values I wanted to change and just ended up with two entries for nicaliases.ib0= in the node def? cheers L. -- Lachlan Simpson Research Technology Services UNSW Research Technology Services Level 3, Chemical Sciences Building F10 UNSW SYDNEY NSW 2052 AUSTRALIA E: lachlan.simp...@unsw.edu.au W: https://research.unsw.edu.au ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] Ofed install and nic config
I can't speak to the MOFED postscript, but for setting up nics, I've used the nics table and confignetwork -s (-s setting the boot interface to static): niccustomscripts.bond0=configbond bond0 ib0@ib2 mode=1@miimon=100 nicdevices.bond0=ib0|ib2 nichostnamesuffixes.bond0=-ib nicips.bond0=x.x.x.x nicnetworks.bond0=ib nictypes.bond0=bond nictypes.ib0=Infiniband nictypes.ib2=Infiniband This worked at a recent customer site. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Huette, Antoine Sent: Thursday, February 27, 2020 10:50 AM To: xCAT Users list Subject: [External] [xcat-user] Ofed install and nic config Hello, The provided mlnxofed_ib_install script is not working for CentOS 7.7 with the 4.7 iso files (tested with 4.7-1.0.0.1 and 4.7-3.2.9.0) When launching the script it stops after putting « nodeststate is« Also it is not clear if I need both confignetwork and confignics, because I need bonding on the internal network and ib0 configuration done when provisioning. NIC config is not done properly because after bond0 is configured the node installer cannot ping the master anymore. Thanks for the help Best regards Antoine Huette HPC engineer Bechtle
Re: [xcat-user] [External] Re: DHCPLEASE for stateless cluster
If the systems have more than one IP address (e.g., provisioning & IPoIB), using the nics table along with either "confignics -s" or "confignetwork -s" will configure all interfaces as static.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: RUSSELL AULD Sent: Saturday, August 24, 2019 9:05 PM To: xCAT Users Mailing list Subject: [External] Re: [xcat-user] DHCPLEASE for stateless cluster

The postscript is 'hardeths'

On August 20, 2019 at 3:54 PM Kevin Keane <kke...@sandiego.edu> wrote:

Remember that the DHCP in xCAT is quite substantially different from "normal" enterprise DHCP, since it needs to only serve a very specific, well-defined, set of computers (all of which normally will have static reservations), except during discovery. I would simply use whatever the default value is. Or if you do want to tweak it, go for a very long duration (to reduce traffic, even if it is only by a tiny amount).

The other big difference between Enterprise DHCP and xCAT DHCP is that, because your xCAT environment is (or should be) very well-defined, it's very unlikely that the DHCP server just crashes out of the blue. If the DHCP server crashes, you probably have a bigger problem.

BTW, a 3 day lease time wouldn't be enough to protect against a weekend outage. The lease time needs to be at least twice the longest expected outage.

If you do want to protect against a DHCP failure anyway, you can do a few things:

- Configure an automatic restart with a cron script or the like.
- Configure the nodes to use a static IP address after bootup. There is a node attribute to do that, but I don't recall what it is off the top of my head. This is actually done in a postscript.

___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859 | Text: 760-721-8339 REMEMBER!
No one from IT at USD will ever ask to confirm or supply your password. These messages are an attempt to steal your username and password. Please do not reply to, click the links within, or open the attachments of these messages. Delete them! On Mon, Aug 19, 2019 at 11:06 PM Heckes Frank (CI/OSB4) via xCAT-user < xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>> wrote: Hello all, I’ve a couple of questions concerning the dhcpd configuration of a cluster runining completely on stateless images. -1- What is ‘best’ choice for the value of the DHCPLEASE parameter to run a stateless cluster? In case of dhcpd failures is seems to be helpful assigning a higher value then the default for at least 3 days seems to reasonable in case the dhcpd crashes over the weekend and an automated check/restarted fails.(?) -2- Does my understanding (dhcpd.leases(5)) and experience is right that the leases stored in /var/lib/dhcpd/dhcpd.leases guarantee a persistent state during dhcpd restarts? Many thanks in advance. Cheers, -Frank Heckes ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net> https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net> https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
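Kevin's rule of thumb in this thread (the lease time should be at least twice the longest expected outage) can be turned into a quick calculation. The following is a minimal sketch in POSIX shell; the function name and the 72-hour outage are illustrative assumptions, not values taken from xCAT.

```shell
#!/bin/sh
# Hypothetical helper: minimum DHCP lease time in seconds for a given
# worst-case outage, using the "twice the longest outage" rule of thumb.
min_lease_seconds() {
    outage_hours=$1
    echo $(( $outage_hours * 2 * 3600 ))
}

# A 3-day (72 h) outage would call for a lease of at least 6 days.
min_lease_seconds 72    # prints 518400
```

By the same rule, the 3-day lease Frank proposed (259200 seconds) only covers outages of up to 36 hours.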
[xcat-user] Hosts table not creating aliases
I created aliased shortened hostnames for the nodes in the cluster. Some of them are created when I run makehosts (ces nodes), and some are not (dss nodes).

[root@fry-mgmt0 ~]# lsdef -v
lsdef - Version 2.14.5.lenovo2 (git commit 06d7097f42eca03db70c9eb93b8abeaf8ca1c2be, built Tue Apr 16 20:08:56 UTC 2019)

[root@fry-mgmt0 ~]# tabdump hosts | egrep 'ces|dss'
#node,ip,hostnames,otherinterfaces,comments,disable
"ces","|\D+(\d+).*$|192.168.1.($1+30)|","|\D+(\d+).*$|ces($1)|","|\D+(\d+).*$|-ipmi:192.168.2.($1+30)|",,
"dssg-2-3","|\D+(\d+).*$|192.168.1.($1+50)|","|\D+(\d+).*$|dss($1)|","|\D+(\d+).*$|-ipmi:192.168.2.($1+50)|",,

[root@fry-mgmt0 ~]# nodels fry-dss01 hosts
fry-dss01: hosts.hostnames: dss01 ### Here's the alias
fry-dss01: hosts.ip: 192.168.1.51
fry-dss01: hosts.node: fry-dss01
fry-dss01: hosts.otherinterfaces: -ipmi:192.168.2.51
fry-dss01: hosts.comments:
fry-dss01: hosts.disable:

[root@fry-mgmt0 ~]# nodels fry-ces01 hosts
fry-ces01: hosts.hostnames: ces01 ### Here's the alias
fry-ces01: hosts.ip: 192.168.1.31
fry-ces01: hosts.node: fry-ces01
fry-ces01: hosts.otherinterfaces: -ipmi:192.168.2.31
fry-ces01: hosts.comments:
fry-ces01: hosts.disable:

[root@fry-mgmt0 ~]# makehosts -l -n

[root@fry-mgmt0 ~]# grep fry-ces01 /etc/hosts
192.168.1.31 fry-ces01.fry.cluster.local fry-ces01 ces01 ### This is what I am trying to do, and it works here
192.168.2.31 fry-ces01-ipmi.fry.cluster.local fry-ces01-ipmi

[root@fry-mgmt0 ~]# grep fry-dss01 /etc/hosts
192.168.1.51 fry-dss01.fry.cluster.local fry-dss01 ### Alias doesn't get created!
192.168.2.51 fry-dss01-ipmi.fry.cluster.local fry-dss01-ipmi

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
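For readers unfamiliar with the regular-expression entries in the hosts table shown in this message: the node name is matched against \D+(\d+).*$ and the captured digits are added to the offset given in parentheses. The sketch below mimics that for the ip column only, in POSIX shell. expand_ip is a made-up helper, not an xCAT command, and it only handles node names that begin with a non-digit.

```shell
#!/bin/sh
# Hypothetical helper: mimic |\D+(\d+).*$|192.168.1.($1+OFFSET)| for one node.
expand_ip() {
    node=$1
    offset=$2
    # Skip the leading non-digits and capture the first run of digits
    num=$(printf '%s\n' "$node" | sed -n 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*$/\1/p')
    [ -n "$num" ] || return 1
    # Strip leading zeros so that values such as "08" are not read as octal
    num=$(printf '%s\n' "$num" | sed 's/^0*\([0-9]\)/\1/')
    echo "192.168.1.$(( $num + $offset ))"
}

expand_ip fry-ces01 30   # prints 192.168.1.31
expand_ip fry-dss01 50   # prints 192.168.1.51
```

Both results agree with the hosts.ip values that nodels reports, so the ip expression expands the same way for the ces and dss nodes. This sketch does not explain the missing alias.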
Re: [xcat-user] [External] Re: Includes not parsing when running nodeset
From what I can tell, it’s not pulling it from my environment, though it is set by the xcat.sh file in /etc/profile.d (installed by the xCAT rpm):

[root@xcat ~]# cat /etc/profile.d/xcat.sh
XCATROOT=/opt/xcat
PATH=$XCATROOT/bin:$XCATROOT/sbin:$XCATROOT/share/xcat/tools:$PATH
MANPATH=$XCATROOT/share/man:$MANPATH
export XCATROOT PATH MANPATH
export PERL_BADLANG=0
[root@xcat ~]# echo ">>$XCATROOT<<"
>>/opt/xcat<<

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: Kevin Keane Sent: Tuesday, June 11, 2019 3:24 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] [External] Re: Includes not parsing when running nodeset

I notice that apparently, XCATROOT also does not pick up any of the default values in that statement. Maybe XCATROOT is set, but to a blank string? Can you do echo ">>$XCATROOT<<" immediately before running restartxcatd? Also make sure that the XCATROOT variable is exported, not just set.

___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859 | Text: 760-721-8339 REMEMBER! No one from IT at USD will ever ask to confirm or supply your password. These messages are an attempt to steal your username and password. Please do not reply to, click the links within, or open the attachments of these messages. Delete them!

On Tue, Jun 11, 2019 at 11:31 AM Christian Caruthers <ccaruth...@lenovo.com> wrote:

Looking in those files, I see in xcatd:

$::XCATROOT = $ENV{'XCATROOT'} ? $ENV{'XCATROOT'} : '/opt/xcat';

And in restartxcatd:

BEGIN { $::XCATROOT = $ENV{'XCATROOT'} ? $ENV{'XCATROOT'} : -d '/opt/xcat' ? '/opt/xcat' : '/usr'; } …

So I’m not sure why they are not being set.
Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Mark Gurevich mailto:gurev...@us.ibm.com>> Sent: Tuesday, June 11, 2019 2:27 PM To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: [External] Re: [xcat-user] Includes not parsing when running nodeset I think XCATROOT gets set by "sourcing" /etc/profile.d/xcat.sh from /opt/xcat/sbin/xcatd and /opt/xcat/sbin/restartxcatd Mark Gurevich Poughkeepsie Development Lab HPC Software Development - xCAT "If we knew what it was we were doing, it would not be called research, would it?" --Albert Einstein [Inactive hide details for Christian Caruthers ---06/11/2019 01:58:44 PM---After some looking around, it appears that the xcat d]Christian Caruthers ---06/11/2019 01:58:44 PM---After some looking around, it appears that the xcat daemon is not picking up the XCATROOT environmen From: Christian Caruthers mailto:ccaruth...@lenovo.com>> To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Date: 06/11/2019 01:58 PM Subject: Re: [xcat-user] [External] Includes not parsing when running nodeset After some looking around, it appears that the xcat daemon is not picking up the XCATROOT environment variable. Looking at the system script, it checks for a file /etc/sysconfig/xcat. This file does not exist, and I cannot find it in the xCAT RPMs, so I’m not sure why it’s being called to. Either way, adding XCATROOT=/opt/xcat, and restarting the daemon appears to have resolved the issue. Big question: It’s still not clear what was causing xcatd to ignore XCATROOT. As I said, the /etc/sysconfig/xcat file doesn’t appear in my xCAT rpms. Is it created by a scriptlet or something? 
From: Christian Caruthers mailto:ccaruth...@lenovo.com>> Sent: Tuesday, June 11, 2019 12:26 PM To: xCAT Users Mailing list (xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>) mailto:xcat-user@lists.sourceforge.net>> Subject: [External] [xcat-user] Includes not parsing when running nodeset When I run nodeset osimage=rhels7.5-x86_64-install-compute #INCLUDEBAD:cannot open /share/xcat/install/scripts/pre.rh.rhels7# #INCLUDEBAD:cannot open /share/xcat/install/scripts/post.xcat# #INCLUDEBAD:cannot open /share/xcat/install/scripts/post.rhels7# This is the default xCAT template that has been untouched. The XCATROOT variable is set to /opt/xcat Running Version 2.14.3.lenovo5 (git commit 06d7097f42eca03db70c9eb93b8abeaf8ca1c2be) ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net> https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net> https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] Re: Includes not parsing when running nodeset
Looking in those files, I see in xcatd:

$::XCATROOT = $ENV{'XCATROOT'} ? $ENV{'XCATROOT'} : '/opt/xcat';

And in restartxcatd:

BEGIN { $::XCATROOT = $ENV{'XCATROOT'} ? $ENV{'XCATROOT'} : -d '/opt/xcat' ? '/opt/xcat' : '/usr'; } …

So I’m not sure why they are not being set.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: Mark Gurevich Sent: Tuesday, June 11, 2019 2:27 PM To: xCAT Users Mailing list Subject: [External] Re: [xcat-user] Includes not parsing when running nodeset

I think XCATROOT gets set by "sourcing" /etc/profile.d/xcat.sh from /opt/xcat/sbin/xcatd and /opt/xcat/sbin/restartxcatd

Mark Gurevich Poughkeepsie Development Lab HPC Software Development - xCAT "If we knew what it was we were doing, it would not be called research, would it?" --Albert Einstein

From: Christian Caruthers <ccaruth...@lenovo.com> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> Date: 06/11/2019 01:58 PM Subject: Re: [xcat-user] [External] Includes not parsing when running nodeset

After some looking around, it appears that the xcat daemon is not picking up the XCATROOT environment variable. Looking at the system script, it checks for a file /etc/sysconfig/xcat. This file does not exist, and I cannot find it in the xCAT RPMs, so I’m not sure why it’s being called to. Either way, adding XCATROOT=/opt/xcat, and restarting the daemon appears to have resolved the issue. Big question: It’s still not clear what was causing xcatd to ignore XCATROOT. As I said, the /etc/sysconfig/xcat file doesn’t appear in my xCAT rpms. Is it created by a scriptlet or something?
From: Christian Caruthers mailto:ccaruth...@lenovo.com>> Sent: Tuesday, June 11, 2019 12:26 PM To: xCAT Users Mailing list (xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>) mailto:xcat-user@lists.sourceforge.net>> Subject: [External] [xcat-user] Includes not parsing when running nodeset When I run nodeset osimage=rhels7.5-x86_64-install-compute #INCLUDEBAD:cannot open /share/xcat/install/scripts/pre.rh.rhels7# #INCLUDEBAD:cannot open /share/xcat/install/scripts/post.xcat# #INCLUDEBAD:cannot open /share/xcat/install/scripts/post.rhels7# This is the default xCAT template that has been untouched. The XCATROOT variable is set to /opt/xcat Running Version 2.14.3.lenovo5 (git commit 06d7097f42eca03db70c9eb93b8abeaf8ca1c2be) ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net<mailto:xCAT-user@lists.sourceforge.net> https://lists.sourceforge.net/lists/listinfo/xcat-user ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
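A note on the two Perl statements quoted in this thread: both fall back to a default whenever the XCATROOT environment variable is false in Perl's sense, which covers an unset variable and the empty string alike. A rough shell analogue follows, as a sketch only. xcatroot_or_default is a hypothetical name, and only the /opt/xcat branch of the xcatd statement is modelled.

```shell
#!/bin/sh
# ${VAR:-default} falls back when VAR is unset or empty, similar to
#   $ENV{'XCATROOT'} ? $ENV{'XCATROOT'} : '/opt/xcat'
# (Perl would also treat the string "0" as false; the shell form does not.)
xcatroot_or_default() {
    printf '%s\n' "${XCATROOT:-/opt/xcat}"
}

XCATROOT=''
xcatroot_or_default      # prints /opt/xcat
XCATROOT=/usr/local/xcat
xcatroot_or_default      # prints /usr/local/xcat
```

So a blank XCATROOT on its own would still resolve to /opt/xcat in these two scripts.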
Re: [xcat-user] [External] Includes not parsing when running nodeset
After some looking around, it appears that the xcat daemon is not picking up the XCATROOT environment variable. Looking at the system script, it checks for a file /etc/sysconfig/xcat. This file does not exist, and I cannot find it in the xCAT RPMs, so I'm not sure why it's being called to. Either way, adding XCATROOT=/opt/xcat, and restarting the daemon appears to have resolved the issue. Big question: It's still not clear what was causing xcatd to ignore XCATROOT. As I said, the /etc/sysconfig/xcat file doesn't appear in my xCAT rpms. Is it created by a scriptlet or something? From: Christian Caruthers Sent: Tuesday, June 11, 2019 12:26 PM To: xCAT Users Mailing list (xcat-user@lists.sourceforge.net) Subject: [External] [xcat-user] Includes not parsing when running nodeset When I run nodeset osimage=rhels7.5-x86_64-install-compute #INCLUDEBAD:cannot open /share/xcat/install/scripts/pre.rh.rhels7# #INCLUDEBAD:cannot open /share/xcat/install/scripts/post.xcat# #INCLUDEBAD:cannot open /share/xcat/install/scripts/post.rhels7# This is the default xCAT template that has been untouched. The XCATROOT variable is set to /opt/xcat Running Version 2.14.3.lenovo5 (git commit 06d7097f42eca03db70c9eb93b8abeaf8ca1c2be) ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
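The workaround described in this message relies on the service script reading /etc/sysconfig/xcat when that file exists. The actual script is not reproduced in the thread, so the following is only a sketch of that common pattern, with a hypothetical function name and the file path passed in as an argument so it can be tried against a scratch file.

```shell
#!/bin/sh
# Sketch of the "source the sysconfig file if present" pattern.
# load_sysconfig is a made-up name; the real xCAT service script may differ.
load_sysconfig() {
    conf=$1
    if [ -f "$conf" ]; then
        # Read variable assignments such as XCATROOT=/opt/xcat
        . "$conf"
        # Export so that a daemon started afterwards inherits the value
        export XCATROOT
    fi
}

# Try it on a temporary file instead of the real /etc/sysconfig/xcat
tmp=$(mktemp)
echo 'XCATROOT=/opt/xcat' > "$tmp"
load_sysconfig "$tmp"
echo "XCATROOT=$XCATROOT"
rm -f "$tmp"
```

If the file is absent, the function leaves the environment untouched, which matches the situation described here before the file was created by hand.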
[xcat-user] Includes not parsing when running nodeset
When I run nodeset osimage=rhels7.5-x86_64-install-compute

#INCLUDEBAD:cannot open /share/xcat/install/scripts/pre.rh.rhels7#
#INCLUDEBAD:cannot open /share/xcat/install/scripts/post.xcat#
#INCLUDEBAD:cannot open /share/xcat/install/scripts/post.rhels7#

This is the default xCAT template that has been untouched. The XCATROOT variable is set to /opt/xcat

Running Version 2.14.3.lenovo5 (git commit 06d7097f42eca03db70c9eb93b8abeaf8ca1c2be)

___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
[xcat-user] DNS reverse lookup issues
Running Version 2.14.1.lenovo2 on RHEL 7.4. I have the following "external" network defined:

Object name: DLS
domain=domain.net
gateway=192.168.100.254
mask=255.255.255.0
mgtifname=eno1
nameservers=192.168.100.1
net=192.168.100.0

site.forwarders = 192.168.100.1
site.domain=cluster.domain.net

resolv.conf has the MN as the nameserver:

domain cluster.domain.net
search cluster.comain.net domain.net
nameserver 172.23.100.10
...

And for the most part, everything works except reverse lookup for IPs not managed by this DNS server:

host mn01
mn01.cluster.domain.net has address 172.23.100.10

host 172.23.100.11
11.100.23.172.in-addr.arpa domain name pointer n01.cluster.domain.net.

host compnode01.domain.net
compnode01.domain.net has address of 192.168.100.11

host 192.168.100.10
host 192.168.100.10.in-addr-arpa. not found: 3(NXDOMAIN)

host 192.168.100.10 192.168.100.1
Using domain server:
Name: 192.168.100.1
Address: 192.168.100.1#53
Aliases:
192.168.100.10.in-addr.arpa domain name pointer mgt01.domain.net

So, if I give it the remote DNS server, the reverse lookup works. Using the xCAT MN, reverse lookup only works for internal IPs.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] How to restrict xCAT's NFS shares?
I believe that is created when xCAT is installed. Not sure which RPM does it, though. Possible the main xCAT or xCAT-server package. I don’t see the file in any of the packages, so I’m guessing it’s created by a script. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Kevin Keane Sent: Wednesday, November 28, 2018 17:26 To: xCAT Users Mailing list Subject: Re: [xcat-user] [External] How to restrict xCAT's NFS shares? My question is actually, how does the /etc/exports get generated, and how do I get xCAT to generate the exports file without the world-writable permissions? Thanks, ___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu<mailto:kke...@sandiego.edu> Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859 REMEMBER! No one from IT at USD will ever ask to confirm or supply your password. These messages are an attempt to steal your username and password. Please do not reply to, click the links within, or open the attachments of these messages. Delete them! On Wed, Nov 28, 2018 at 1:50 PM Christian Caruthers mailto:ccaruth...@lenovo.com>> wrote: So long as the shares are available to your provisioning network, it should not break anything. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Kevin Keane mailto:kke...@sandiego.edu>> Sent: Wednesday, November 28, 2018 16:37 To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: [External] [xcat-user] How to restrict xCAT's NFS shares? I noticed that xCAT shares /tftpboot and /install as world-writeable. Is there a way to restrict these NFS shares to only the networks within the cluster, without making them globally available? 
Specifically, xCAT creates this /etc/exports file:

/tftpboot *(rw,no_root_squash,sync,no_subtree_check)
/install *(rw,no_root_squash,sync,no_subtree_check)

I would like it to instead create this:

/tftpboot 192.168.10.0/24(rw,no_root_squash,sync,no_subtree_check)
/tftpboot 192.168.11.0/24(rw,no_root_squash,sync,no_subtree_check)
/install 192.168.10.0/24(rw,no_root_squash,sync,no_subtree_check)
/install 192.168.11.0/24(rw,no_root_squash,sync,no_subtree_check)

(where 192.168.10.0 and 192.168.11.0 are two networks defined in the network table) Is that doable? Thanks!

___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859 REMEMBER! No one from IT at USD will ever ask to confirm or supply your password. These messages are an attempt to steal your username and password. Please do not reply to, click the links within, or open the attachments of these messages. Delete them!

___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
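The exports file Kevin asks for in this thread can be written out mechanically from a list of networks. The sketch below only prints the desired lines to standard output. It is not how xCAT generates /etc/exports, the function name is hypothetical, and whether xCAT later overwrites a hand-edited file is not answered in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print one export line per directory and network,
# reusing the option string from the file xCAT created.
exports_for() {
    opts='rw,no_root_squash,sync,no_subtree_check'
    for dir in /tftpboot /install; do
        for net in "$@"; do
            printf '%s %s(%s)\n' "$dir" "$net" "$opts"
        done
    done
}

# The two cluster networks named in the message
exports_for 192.168.10.0/24 192.168.11.0/24
```

Run as shown, this prints the four lines from Kevin's message in the same order. Reviewing the output before copying it into place keeps the change easy to revert.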
Re: [xcat-user] [External] Re: Excluding a node from discovery?
You can define the MAC address and set the chain table so that xCAT will not try to "do" anything with the node:

chdef {NODE} currstate=boot currchain=boot chain=boot
nodeset {NODE} boot

You probably don't need all of those chain table settings, but that should catch everything. This way, the DHCP server will not attempt to discover or install when a PXE request is received.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: Casandra H Qiu Sent: Wednesday, November 28, 2018 15:12 To: xCAT Users Mailing list Subject: [External] Re: [xcat-user] Excluding a node from discovery?

If mac address defined, DHCP server already knows this node. bmcdiscover will not send genesis kernel , but the node attribute will be replaced from discover packet if there are difference. For the mtms-based discover, the bmcdiscover will send findme packet and looking for per-definied node with same mtm/serial. Yes, if it didn't find, that would cause bmcdiscover to think this node hasn't been discovered, the temp bmc discover will not be removed from xcat database. The discover packet should contain same info as what defined for the storage node (CPU, disksize, mtm/serial, memory), i think you should just keep mac, mtm/serial for the storage node, it will not run into discovery process.

... Casandra Hong Qiu Phone: (845) 433-9291, t/l 293-9291 Office: Building 8, 3-B-04 cxh...@us.ibm.com

From: Kevin Keane <kke...@sandiego.edu> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> Date: 11/28/2018 02:18 PM Subject: Re: [xcat-user] Excluding a node from discovery?
If I remove mtms and serial from the storage node definition, how would that cause bmcdiscover to ignore this node? It seems to me that in the contrary, that would cause bmcdiscover to think this node hasn't been discovered yet. But your response inspired a thought - if I do the opposite and *add* mtms, serial and MAC address to the storage node object, would that be enough to get bmcdiscover to think this node has already been discovered? Or do I need any other settings? _ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu<mailto:kke...@sandiego.edu> Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859 REMEMBER! No one from IT at USD will ever ask to confirm or supply your password. These messages are an attempt to steal your username and password. Please do not reply to, click the links within, or open the attachments of these messages. Delete them! On Wed, Nov 28, 2018 at 11:02 AM Casandra H Qiu mailto:cxh...@us.ibm.com>> wrote: You may need to remove mtms/serial number from storage node definition. Also, you should remove mac address from storage node definition, "makedhcp -d storagenode " to remove from DHCP lease file Thanks, Casandra Qiu ... Casandra Hong Qiu Phone: (845) 433-9291, t/l 293-9291 Office: Building 8, 3-B-04 cxh...@us.ibm.com<mailto:cxh...@us.ibm.com> Kevin Keane ---11/28/2018 01:03:36 PM---I am looking for a way to exclude one node from being discovered (MTMS-based discovery). From: Kevin Keane mailto:kke...@sandiego.edu>> To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Date: 11/28/2018 01:03 PM Subject: [xcat-user] Excluding a node from discovery? I am looking for a way to exclude one node from being discovered (MTMS-based discovery). The background is that we have quite a few compute nodes, and one storage node. The storage node is managed separately, but is of course connected to the same networks as the compute nodes. 
If I blindly run bmcdiscover on the whole subnet, it will discover the storage node. The worst-case scenario is that I accidentally reformat it and lose data. So I am looking for a way to keep this node from ever even being discovered in the first place. Any ideas? Thanks! ___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu<mailto:kke...@sandiego.edu> Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859 REMEMBER! No one from IT at USD will ever ask to confirm or supply your password. These messages are an attempt to steal your username and password. Please do not reply to, click
Re: [xcat-user] [External] How to restrict xCAT's NFS shares?
So long as the shares are available to your provisioning network, it should not break anything.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: Kevin Keane Sent: Wednesday, November 28, 2018 16:37 To: xCAT Users Mailing list Subject: [External] [xcat-user] How to restrict xCAT's NFS shares?

I noticed that xCAT shares /tftpboot and /install as world-writeable. Is there a way to restrict these NFS shares to only the networks within the cluster, without making them globally available? Specifically, xCAT creates this /etc/exports file:

/tftpboot *(rw,no_root_squash,sync,no_subtree_check)
/install *(rw,no_root_squash,sync,no_subtree_check)

I would like it to instead create this:

/tftpboot 192.168.10.0/24(rw,no_root_squash,sync,no_subtree_check)
/tftpboot 192.168.11.0/24(rw,no_root_squash,sync,no_subtree_check)
/install 192.168.10.0/24(rw,no_root_squash,sync,no_subtree_check)
/install 192.168.11.0/24(rw,no_root_squash,sync,no_subtree_check)

(where 192.168.10.0 and 192.168.11.0 are two networks defined in the network table) Is that doable? Thanks!

___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859 REMEMBER! No one from IT at USD will ever ask to confirm or supply your password. These messages are an attempt to steal your username and password. Please do not reply to, click the links within, or open the attachments of these messages. Delete them!

___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
[xcat-user] Encrypted passwords in passwd table
Looking to set up encrypted passwords, and the only documentation I see is on the old SF site: https://sourceforge.net/p/xcat/wiki/Encrypted_root_password_in_passwd.tab/ Is there any newer documentation? I didn't see it on the readthedocs site. Also, does this only work for the root password, or can it also be used for IPMI? Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] Network interface naming
There is a way to make CentOS/RHEL 7 not use consistent net device naming. That’s outlined here: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/sec-disabling_consistent_network_device_naming For boot parameters, you can populate the addkcmdline field in the bootparams table. As for handling your devices in different nodes, you add them to either a centos6 or centos7 group and populate the nics table accordingly by the group membership (i.e.: centos6 would use “eth0” and centos7 would use “eno1”). Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: langton nkiwane Sent: Monday, November 5, 2018 08:12 To: xcat-user@lists.sourceforge.net Subject: [External] [xcat-user] Network interface naming I have images for centos 6 and some for centos 7. The network naming is different since for centos 6 is em# and for centos 7 is eno#. How do I register these interfaces in the respective tables in xcat or how do I update kernel options in xcat for a respective image. Regards ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] [External] Re: rexec line 138: Deprecated option KeyRegenerationInterval
This information would be very useful in the xCAT docs! Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Casandra H Qiu Sent: Tuesday, September 25, 2018 16:19 To: xCAT Users Mailing list Subject: [External] Re: [xcat-user] rexec line 138: Deprecated option KeyRegenerationInterval How did u define synclists ? To debug startsyncfiles.awk on the compute node: 1) export USEOPENSSLFORXCAT=1 2) export XCATSERVER=:3001 3) modify /xcatpost/startssyncfiles.awk and add "print $0" after while loop before if match 4) run: "/xcatpost/startsyncfiles.awk -v RCP=/usr/bin/rsync" it will show error message. Thanks, Casandra ... Casandra Hong Qiu Phone: (845) 433-9291, t/l 293-9291 Office: Building 8, 3-B-04 cxh...@us.ibm.com<mailto:cxh...@us.ibm.com> [Inactive hide details for Mike Marsh ---09/25/2018 03:43:30 PM---Hi Casandra, It's not a hierarchy cluster..]Mike Marsh ---09/25/2018 03:43:30 PM---Hi Casandra, It's not a hierarchy cluster.. From: Mike Marsh mailto:marsh_m...@cat.com>> To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Date: 09/25/2018 03:43 PM Subject: Re: [xcat-user] rexec line 138: Deprecated option KeyRegenerationInterval Hi Casandra, It’s not a hierarchy cluster.. I did find the lines to comment out the “KeyRegenerationInterval” lines in the “remoteshell” script. Now I am not getting the error message. But that didn’t fix the files not syncing to the compute nodes.. Just started looking at the syncfiles script.. The syncfiles script, calls an awk script “startsyncfiles.awk” Not the best with awk , will take me some time to work through that one.. 
Thanks Mike… Caterpillar: Confidential Green From: Casandra H Qiu [mailto:cxh...@us.ibm.com] Sent: Tuesday, September 25, 2018 2:31 PM To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: Re: [xcat-user] rexec line 138: Deprecated option KeyRegenerationInterval if it is hierarchy cluster, need to run `updatenode` command to sync files to service node first then provision the compute node. "updatenode compute -F" will sync file after compute node is up. Thanks, Casandra ... Casandra Hong Qiu Phone: (845) 433-9291, t/l 293-9291 Office: Building 8, 3-B-04 cxh...@us.ibm.com<mailto:cxh...@us.ibm.com> [Inactive hide details for Mike Marsh ---09/25/2018 03:07:25 PM---Hello, Yes, the node does come up, and is running rhel7.4..]Mike Marsh ---09/25/2018 03:07:25 PM---Hello, Yes, the node does come up, and is running rhel7.4.. From: Mike Marsh mailto:marsh_m...@cat.com>> To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Date: 09/25/2018 03:07 PM Subject: Re: [xcat-user] rexec line 138: Deprecated option KeyRegenerationInterval Hello, Yes, the node does come up, and is running rhel7.4.. I can ssh to the node okay. The only piece not working, is the “syncfiles” does not sync the files to the compute node. I am thinking now, the message “Deprecated option KeyRegenerationInterval” may be just cosmetic. I am now wondering if the problem exists in the syncfiles script.. I haven’t looked at that script yet. The error message “[xcat-user] rexec line 138: Deprecated option KeyRegenerationInterval” may have pointed me in the wrong direction.. I will take a look at the syncfiles script… Thanks Mike Marsh Caterpillar: Confidential Green From: Ezell, Matthew A. [mailto:ezel...@ornl.gov] Sent: Tuesday, September 25, 2018 1:57 PM To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: Re: [xcat-user] rexec line 138: Deprecated option KeyRegenerationInterval This option comes from the remoteshell postscript. 
Removed with: https://github.com/xcat2/xcat-core/pull/4599 back in April. You can upgrade xCAT or just manually patch the remoteshell file. ~Matt --- Matt Ezell HPC Systems Administrator Oak Ridge National Laboratory From: Kevin Keane mailto:kke...@sandiego.edu>> Reply-To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Date: Tuesday, September 25, 2018 at 2:23 PM To: xCAT Users Mailing list mailto:xcat-user@lists.sourceforge.net>> Subject: Re: [xcat-user] rexec line 138: Deprecated option KeyRegenerationInterval I suspect that you have two separate problems. The message about Deprecated option KeyRegenerationInterval message comes from either sshd or ssh (you can tell by when you see it - either when sshd starts, or when you try to use ssh). It should have no effect; the option will just get ignored. The only exception is if the ssh/sshd configuration is not workable without that option. To avoid the message, see Casandra's suggestion, but that's probably just cosmetic
Re: [xcat-user] [External] What is the best way for changing/maintain users/groups/passwords for the computing nodes?
Some suggestions:
- Rather than syncing the passwd, group, and shadow files to the systems, use a postscript to simply append what you need to those files.
- Set the xCAT management node up as an NIS server.
- Set up Ansible on the xCAT MN to manage/create user accounts.
- Connect to an LDAP or AD domain.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

-Original Message- From: Daniel Hilst Selli Sent: Monday, June 18, 2018 12:56 To: xCAT Users Mailing list Subject: [External] [xcat-user] What is the best way for changing/maintain users/groups/passwords for the computing nodes?

Hi! I had a problem where I couldn't log in to a computing node with the password stored under the system key of the passwd table. I searched the internet for options for setting the password in xCAT. The documentation says

chtab key=system passwd.username=root passwd.password=abc123

But I don't really understand how this password gets into /etc/shadow on the computing nodes. Changing the password and rebooting a stateless node has no effect: the node keeps using the old password, and the passwd table and the node's /etc/shadow are out of sync. I saw people on the internet synchronizing /etc/{group,shadow,passwd} from the management node, but if that is the case, what is the point of the system key in the passwd table? Any suggestions on how to handle computing node users will be appreciated!

Regards,

-- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
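The first suggestion in the reply above (appending entries from a postscript instead of syncing whole files) could look roughly like the sketch below. This is only an illustration, not an xCAT-provided script: the function name append_entry, the ETC_DIR variable, and the account data are all made up for the example.

```shell
#!/bin/sh
# Hypothetical postscript helper (not part of xCAT): append one entry to a
# passwd-style file only when no entry with the same name exists yet.
# Usage: append_entry FILE ENTRY   (ENTRY is a full colon-separated line)
append_entry() {
    file=$1
    entry=$2
    name=${entry%%:*}              # text before the first colon
    if grep -q "^${name}:" "$file" 2>/dev/null; then
        return 0                   # already present; leave the file alone
    fi
    printf '%s\n' "$entry" >> "$file"
}

# Example use with made-up account data. ETC_DIR is an assumption of this
# sketch so the target directory can be redirected while testing:
# ETC_DIR=${ETC_DIR:-/etc}
# append_entry "$ETC_DIR/group"  'hpcusers:x:5000:'
# append_entry "$ETC_DIR/passwd" 'alice:x:5001:5000:Alice:/home/alice:/bin/bash'
```

Because the helper checks for an existing entry first, running the postscript again (for example through updatenode) should not duplicate lines. Password hashes for /etc/shadow would need the same care plus suitable file permissions, which the sketch does not cover.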
Re: [xcat-user] (no subject)
Can you paste the contents of the networks table entry for the 148.187.x.xx network you’re using for the lcg.cscs.ch network/subdomain?

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: Carmelo Ponti (CSCS) [mailto:cpo...@cscs.ch] Sent: Thursday, March 15, 2018 07:04 To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> Cc: Conciatore Dino <conciat...@cscs.ch> Subject: Re: [xcat-user] (no subject)

Dear Bin and Christian,

Thank you very much for your prompt answer. At CSCS we usually use xCAT to install specific clusters, and we have never faced this problem. In this case we are trying to use xCAT to install general-purpose servers, small clusters, or virtual machines, so we have a very heterogeneous environment. As you pointed out, we tried to install servers located on a different subnet with a different subdomain (e.g.: bar..cscs.ch). Yesterday our first configuration was the following:

hosts.csv:"arc06","148.187.xxx.xxx","arc06.lcg.cscs.ch",,,
mac.csv:"arc06","ens32","00:50:56:xx:xx:xx",,
nodehm.csv:"arc06","ipmi","ipmi","kvm",,,
nodelist.csv:"arc06","all,cscs","booting","03-15-2018 10:02:31",
noderes.csv:"arc06",,"pxe","148.187.x.xx","/tftpboot","148.187.x.xx""ens32"
nodetype.csv:"arc06","centos7.4","x86_64","cscs","centos7.4-x86_64-install-cscs"

This is the way I have used for years, but in this case it failed. After some tests we configured arc06 again as follows:

hosts.csv:"arc06.lcg.cscs.ch","148.187.xxx.xxx","arc06.lcg.cscs.ch",,,
mac.csv:"arc06.lcg.cscs.ch","ens32","00:50:56:xx:xx:xx",,
nodehm.csv:"arc06.lcg.cscs.ch","ipmi","ipmi","kvm",,,
nodelist.csv:"arc06.lcg.cscs.ch","all,cscs",,,
noderes.csv:"arc06.lcg.cscs.ch",,"pxe","148.187.x.xx","/tftpboot","148.187.x.xx""ens32"
nodetype.csv:"arc06.lcg.cscs.ch","centos7.4","x86_64","compute","centos7.4-x86_64-install-compute",,"osi",,

And now we managed to install the server without problems.
Apparently we fixed our problem but it would be nice to hear your comments/suggestions too. Thank you, Carmelo On Thu, 2018-03-15 at 05:49 +, Bin XA Xu wrote: Hi Carmelo, Thanks to give us the information, but pointed out by Christian, we'd like to know more about your scenario there. It seems you want some nodes' FQDN are foo.cscs.ch, and some of them are bar..cscs.ch. Are those nodes in the same subnet? and why we want the different domain, just for split a big cluster into some small ones? Bin Xu HPC Software Development Software Defined Infrastructure, IBM Systems Phone: 86-010-82454067 E-mail: bx...@cn.ibm.com<mailto:bx...@cn.ibm.com> - Original message - From: Christian Caruthers <ccaruth...@lenovo.com<mailto:ccaruth...@lenovo.com>> To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>> Cc: Subject: Re: [xcat-user] (no subject) Date: Wed, Mar 14, 2018 9:34 PM Are you asking about multi-homed cluster clients or whether xCAT can manage a client that does not have an interface on the provisioning network? How do these different domains translate to networks? Are they all on the same network, or separated into different subnets? Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Carmelo Ponti (CSCS) [mailto:cpo...@cscs.ch] Sent: Wednesday, March 14, 2018 07:25 To: xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net> Cc: Conciatore Dino <conciat...@cscs.ch<mailto:conciat...@cscs.ch>> Subject: [xcat-user] (no subject) Hello, at CSCS we would like to use the same xcat server to install different servers with different domains. With the main domain defined in /etc/xcat/site.sqlite we don't have any problem: # tabdump site | grep domain "domain","cscs.ch",, All servers with a name or an alias .cscs.ch are working perfectly. But if we try to install a new server with another domain (e.g.: hostname.something.cscs.ch), this will failed. 
I would like to know if it's possible to define multiple domains on the same xcat server. Thank you in advance, Carmelo Ponti -- -- Carmelo Ponti System Engineer CSCSSwiss Center for Scientific Computing Via Trevano 131 Email: cpo...@cscs.ch<mailto:c
Re: [xcat-user] (no subject)
Are you asking about multi-homed cluster clients or whether xCAT can manage a client that does not have an interface on the provisioning network? How do these different domains translate to networks? Are they all on the same network, or separated into different subnets? Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -Original Message- From: Carmelo Ponti (CSCS) [mailto:cpo...@cscs.ch] Sent: Wednesday, March 14, 2018 07:25 To: xcat-user@lists.sourceforge.net Cc: Conciatore Dino <conciat...@cscs.ch> Subject: [xcat-user] (no subject) Hello, at CSCS we would like to use the same xcat server to install different servers with different domains. With the main domain defined in /etc/xcat/site.sqlite we don't have any problem: # tabdump site | grep domain "domain","cscs.ch",, All servers with a name or an alias .cscs.ch are working perfectly. But if we try to install a new server with another domain (e.g.: hostname.something.cscs.ch), this will failed. I would like to know if it's possible to define multiple domains on the same xcat server. Thank you in advance, Carmelo Ponti -- -- Carmelo Ponti System Engineer CSCSSwiss Center for Scientific Computing Via Trevano 131 Email: cpo...@cscs.ch CH-6900 Lugano http://www.cscs.ch Phone: +41 91 610 82 15/Fax: +41 91 610 82 82 -- -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] Local scratch for stateless compute nodes
We’ve used similar scripts in the past without any checks to prevent unintended disasters. It would be pretty easy to use an if or case statement to ensure anything destructive only happens on the right systems. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Vinícius Ferrão [mailto:fer...@versatushpc.com.br] Sent: Monday, November 27, 2017 2:17 PM To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> Subject: Re: [xcat-user] Local scratch for stateless compute nodes Very good info indeed. I will be looking on the script, Kevin. It would be sufficient for a while, but about the feature, it would be nice to be fixed/documented/explained how to use it, because it’s a pretty common use case of stateless nodes. Perhaps someone on the dev team can look at this? Should we open a ticket on the issue tracker? V. On 27 Nov 2017, at 17:02, Gilad Berman <gber...@lenovo.com<mailto:gber...@lenovo.com>> wrote: THX Kevin for the tip!! We actually used similar method, but the first post on the thread reminded me of the localdisk feature and I thought it can be very nice to use it, if working. Gilad Berman HPC Architect Lenovo EMEA +972-52-2554262 gber...@lenovo.com<mailto:gber...@lenovo.com> Lenovo.com <http://www.lenovo.com/> Twitter<http://twitter.com/lenovo> | Facebook<http://www.facebook.com/lenovo> | Instagram<https://instagram.com/lenovo> | Blogs<http://blog.lenovo.com/> | Forums<http://forums.lenovo.com/> From: Kevin Keane [mailto:kke...@sandiego.edu] Sent: Monday, November 27, 2017 7:22 PM To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>> Subject: Re: [xcat-user] Local scratch for stateless compute nodes To address this, we are using the syncfiles mechanism to copy an rc.local file into the compute node (we could probably also put it directly into the image) This rc.local contains statements to mount the /tmp volume. 
We originally also used it to partition and format the physical hard disk, but that proved too dangerous when somebody accidentally ran the script on the management node and wiped out the partition table... You could probably do something similar with a swap partition.

Here is the script we are using. A few notes:
- a prerequisite is an entry that mounts /dev/sda1 as /localscratch in fstab.
- moving the content from /tmp to /localscratch/tmp actually isn't working flawlessly; it is just good enough for our purposes.

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# Let's see if the local disk is already formatted and set up - if so,
# we won't redo it.
if [ -d /localscratch/tmp ]
then
    echo "Localscratch disk is already formatted"
else
    umount /dev/sda1
    ##
    # DANGER DANGER DANGER! This code can, and will, blindly destroy
    # partitions on whatever computer it is run. Partition recovery
    # is not easy.
    ##
    # Format the disk in the compute node. Single partition, mounted
    # as localscratch.
    #dd if=/dev/zero of=/dev/sda bs=1M count=100
    #parted -s /dev/sda mklabel gpt
    #parted -s -a optimal /dev/sda mkpart primary ext3 0% 100%
    #mkfs -t ext3 /dev/sda1
    mount -a
fi

# Create the /tmp and /var/tmp directories.
for i in /tmp /var/tmp /var/log
do
    # Make sure mv includes .dotfiles
    shopt -s dotglob
    mkdir -p /localscratch$i
    chmod 755 /localscratch
    case "$i" in
    /tmp)
        # The first digit in the mode is the sticky bit.
        chmod 1777 /localscratch$i
        ;;
    /var/tmp)
        # The first digit in the mode is the sticky bit.
        chmod 1777 /localscratch$i
        ;;
    *)
        chmod 755 /localscratch$i
    esac
    if [ ! -h $i ]
    then
        if [ -n "$(ls -A $i)" ]
        then
            mv $i/* /localscratch$i
        fi
        # In theory, the directory should be empty because we moved everything
        # out of the way. But that may have failed if the localscratch directory
        # was already used.
        rm -rf $i
        ln -sf /localscratch$i $i
    fi
    shopt -u dotglob
done

mkdir -p /localscratch/ansys
chmod 777 /localscratch/ansys

On Mon, Nov 27, 2017 at 6:57 AM, Gilad Berman <gber...@lenovo.com> wrote:

1. I use local disk for scratch and swap. Somethings logs as well (in this case you can think of it as sort of statelite, but from xCAT perspective, it is still stateless).
2. I took only the part that not relate to statelite from the instructions - not working.

Gilad Berman HPC Architect Lenovo EMEA +972-52-2554262 gber...@lenovo.com<mailto:gber...@leno
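The reply at the top of this thread suggests wrapping anything destructive in an if or case statement so it only runs on the intended systems. A minimal sketch of such a guard follows. The function name safe_to_format and the hostname patterns (compute[0-9]*, n[0-9]*) are assumptions for illustration; they would have to be adapted to the site's real node naming scheme.

```shell
#!/bin/sh
# Hypothetical guard (names and patterns are illustrative only): succeed
# only for hostnames that match the compute-node naming pattern.
# Usage: safe_to_format HOSTNAME
safe_to_format() {
    case "$1" in
        compute[0-9]*|n[0-9]*) return 0 ;;   # looks like a compute node
        *)                     return 1 ;;   # anything else, e.g. the MN
    esac
}

# Intended use inside an rc.local-style script:
# if safe_to_format "$(hostname -s)"; then
#     : # destructive partitioning/formatting steps would go here
# else
#     echo "Refusing to format disks on $(hostname -s)" >&2
# fi
```

Matching on an allow-list of names, rather than excluding the management node by name, means an unexpected host falls through to the refusing branch by default.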
[xcat-user] makeconfluentcfg error
Jarrod, Seeing a strange error with U Roc. I got pulled into this last minute: makeconfluentcfg Error: confluent plugin bug, pid 31070, process description: 'xcatd SSL: makeconfluentcfg for root@localhost: confluent instance' with error 'Can't use string ("xcat") as an ARRAY ref while "strict refs" in use at /opt/xcat/lib/perl/xCAT_plugin/confluent.pm line 306. If I comment out the line for the node "xcat" in nodelist, it goes down a number of lines (to 176) in the nodelist file, then it complains about the next 6 lines. i gave up after that. I can run makeconfluentcfg for individual hosts (ie. bhw0001). Any thoughts? I've included the lines it complained about along with the preceding and following lines: "bh-eth07","switch",,, "xcat","__mgmtnode",,, "xcat1","__mgmtnode,ipmi",,, "bhw0103","rackC1,all,ward,ipmi,dx360","powering-off","06-30-2017 07:32:02","synced","04-03-2014 08:30:20",,, "kvm-host03","ipmi,x3550,kvm","booting","05-22-2017 15:39:08","synced","03-14-2014 12:42:30",,, "kvm-host04","ipmi,x3550,kvm","booting","05-22-2017 15:39:08","synced","03-14-2014 12:42:30",,, "kvm-host05","ipmi,x3550,kvm","booting","05-22-2017 15:39:08","synced","03-14-2014 12:42:30",,, "bhh0002","ipmi,hadoop,x3630","powering-off","05-22-2017 07:39:25","synced","03-21-2014 14:55:26",,, "bhh0003","ipmi,hadoop,x3630","powering-off","05-22-2017 07:39:25","synced","03-21-2014 14:55:26",,, "bhh0004","ipmi,hadoop,x3630","powering-off","05-22-2017 07:39:25","synced","03-21-2014 14:55:26",,, "bhh0005","ipmi,hadoop,x3550","powering-off","05-22-2017 07:39:25","synced","03-21-2014 14:55:26",,, "gss01","gss,gss_2.6.0,gssServer_2.6.0,bhgss","booting","05-22-2017 14:35:49","synced","08-31-2016 18:40:43",,, Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! 
Re: [xcat-user] makedns-generated zones and their NS record
I just had to deal with a similar issue. Try the following:
- Fill out each network in the networks table, including the domain name. Don’t worry about nameservers or gateway unless there are external resources that should be used. NOTE: defining an external nameserver in the networks table will cause makedns to ignore any IPs in that subnet.
- Define hpcmn-test as an unmanaged node in the cluster (nodeadd hpcmn-test groups=__Unmanaged or something similar).
- Define its primary interface (the one whose domain matches the site.domain value) in hosts.node/hosts.ip (this should also be what was defined in nodelist above).
- Define all other interfaces in hosts.otherinterfaces with their FQDNs. For example:
  "hpcmn-test","1.2.3.4",,"hpctest.compute.sabre.kkeane.sandiego.edu:2.3.4.5,hpcmn-test.imm.sabre.kkeane.sandiego.edu:3.4.5.6"

The domain names listed for each IP in hosts should match the networks.domain entry for each respective subnet.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: Kevin Keane [mailto:kke...@sandiego.edu] Sent: Tuesday, October 31, 2017 4:21 PM To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> Subject: [xcat-user] makedns-generated zones and their NS record

I have a management node with three NICs, and want to use makedns to generate the DNS configuration. My management node has three names, corresponding to the three NICs:

eth0: hpcmn-test.compute.sabre.kkeane.sandiego.edu
eth1: hpcmn-test.kkeane.sandiego.edu
eth2: hpcmn-test.imm.sabre.kkeane.sandiego.edu

hostname -f returns hpcmn-test.kkeane.sandiego.edu (which is the name by which my management node will be known on our public network). I have the DNS server listening only on eth2. Consequently, the zones in the DNS server should have the corresponding name server hpcmn-test.imm.sabre.kkeane.sandiego.edu. However, the zones generated by makedns -n instead use the hpcmn-test.kkeane.sandiego.edu name.

$TTL 86400
@ IN SOA hpcmn-test.kkeane.sandiego.edu. root.hpcmn-test.kkeane.sandiego.edu. ( 2017103100 10800 3600 604800 86400 )
  IN NS hpcmn-test.kkeane.sandiego.edu.

This wreaks havoc with future calls to makedns; updates will time out because the DNS server is not listening at the IP address that corresponds to this name (and in fact, makehosts doesn't even put this name into /etc/hosts).

How can I get makedns to generate zones with an NS record that points to hpcmn-test.imm.sabre.kkeane.sandiego.edu?

Thanks!

-- ___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
[xcat-user] Syncfiles addressing synclist order
Trying to use multiple synclists for different node groups, and seeing some xdcp strangeness:

xCAT Version 2.13.5.POST107.g8489de8

I have:

lsdef -t osimage ces
Object name: ces
    *snip*
    synclists=/install/templates/ces/ces.synclist,/install/templates/stg.synclist
    *snip*

In each synclist is a line to push a file to /etc/sysconfig/network. The one called by stg.synclist is a default file all hosts get, while the one called by ces.synclist is specific to this node group. I cannot get updatenode to push the correct file to the hosts. Regardless of the order they're listed in the osimage definition, the name of the file, or the atime on the file, "updatenode node -VF" shows xdcp reading ces.synclist first and stg.synclist second. If I only have ces.synclist defined in there, it pushes the correct file.

Running command on xcat-bloom: ip -4 --oneline addr show |awk -F ' ' '{print $4}'|awk -F '/' '{print $1}' 2>&1
Running command on xcat-bloom: ip -4 --oneline addr show |awk -F ' ' '{print $4}'|awk -F '/' '{print $1}' 2>&1
Running command on xcat-bloom: chmod -R a+r /install/postscripts 2>&1
xcat-bloom: Internal call command: xdcp bces0 --nodestatus -F /install/templates/ces/ces.synclist
Running internal xCAT command: xdcp ...
Running command on xcat-bloom: ip -4 --oneline addr show |awk -F ' ' '{print $4}'|awk -F '/' '{print $1}' 2>&1
xcat-bloom: Internal call command: xdcp bces0 --nodestatus -F /install/templates/stg.synclist
Running internal xCAT command: xdcp ...
Running command on xcat-bloom: ip -4 --oneline addr show |awk -F ' ' '{print $4}'|awk -F '/' '{print $1}' 2>&1
File synchronization has completed for nodes.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872
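Since the symptom above comes from two synclists targeting the same destination, with the file from the list processed last winning, it can help to spot such overlaps before running updatenode. The sketch below assumes the simple "source -> destination" synclist line format. The function name find_synclist_conflicts is made up, and the sketch does not handle EXECUTE-style clauses or other special entries.

```shell
#!/bin/sh
# Hypothetical helper (not an xCAT command): print every destination path
# that is targeted by more than one of the given synclist files.
# Assumes plain "source -> destination" lines; '#' lines are comments.
find_synclist_conflicts() {
    awk -F' *-> *' '
        /^[ \t]*#/ { next }                 # skip comment lines
        NF == 2 {
            dest = $2
            sub(/[ \t]+$/, "", dest)        # trim trailing whitespace
            if (!((dest, FILENAME) in seen)) {
                seen[dest, FILENAME] = 1    # count each file once per dest
                count[dest]++
            }
        }
        END {
            for (d in count) if (count[d] > 1) print d
        }
    ' "$@" | sort
}

# Example with the paths from the message above:
# find_synclist_conflicts /install/templates/ces/ces.synclist \
#                         /install/templates/stg.synclist
```

Any path it prints is pushed by more than one list, so which version ends up on the node depends on the order in which xdcp processes the lists.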
Re: [xcat-user] How to disable BMC discovery
1. use manual discovery for nodes. Not doing BMC discovery.

Sounds like you have a handle on this using the nodediscover command.

2. set up a node with nodeadd command, setup its MAC address. Update DNS, DHCP, to complete setup.

nodeadd would be done before anything else. After that, you can define an IP for the node either in the hosts table (followed by running makehosts) or by just manually editing /etc/hosts. Finally, you can run makedns to include the node in the cluster DNS.

3. turn the node on, do a manual discovery, add its UUID with a node name

The regular nodediscover process should get this info and populate the appropriate tables. It shouldn’t matter if you’re doing manual or switch-based discovery (which you can still do w/o doing bmcsetup).

4. let xcat assign an IP address to it and let xcat setup that IP as static IP in the node. like the ifcfg-eth0 file example above.

This goes back to defining an IP for the node. Once that is done, you can use the postscript "hardeths" or "confignics -s" to set up the provisioning nic as static during postinstall. I believe the difference between the two is that hardeths will only set up the provisioning nic, whereas confignics will set up any nics defined in the nics table, and will additionally set up the provisioning nic if it's passed the "-s" option. The provisioning nic does not need to be defined in the nics table. Both can work either as a postscript or postbootscript.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872

From: Imam Toufique [mailto:techie...@gmail.com] Sent: Thursday, October 26, 2017 12:04 AM To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> Subject: Re: [xcat-user] How to disable BMC discovery

Zhao, Thank you very much for your reply, I deleted the dhcpd.leases file and did a 'nodeset n01 osimage' and then I was able to image the node.
I do understand now that I was not pushing the node to an 'install' state with the 'nodeset n02 osimage' command, hence the node was not building at all. Christian, Hi, thanks to you as well for helping out in a great deal :-), very much appreciated! Yes, I use manual discovery method where xcat would expect a MAC addr. from a node and move on with imaging the node. But, I can't do 'nodeset n02 osimage' until I define an IP address for the node. I would like the IP address to auto-assigned for the node and then setup /etc/sysconfig/network-scripts/ifcfg-eth0 file with a static IP setup after the node is built. for example, see below, after a node is imaged with an auto-assigned IP address , I would like the IP to be set up in the ifcfg-eth0 file as below (an example): TYPE="Ethernet" PROXY_METHOD="none" BROWSER_ONLY="no" BOOTPROTO="none" <-- this is no longer set to 'dhcp' DEFROUTE="yes" IPV4_FAILURE_FATAL="no" IPV6INIT="yes" NAME="em1" UUID="0909715c-4eee-4a8f-b4aa-2062cd5cae94" DEVICE="em1" ONBOOT="yes" ETHTOOL_OPTS="wol d" IPADDR="10.1.1.199" . <-- IP address set in here PREFIX="24" GATEWAY="10.1.1.1" <-- GW set here DNS1="8.8.8.8" Sorry about long rant here, let me try to summarize what I would like to do: Is that above possible? Of course, I can modify the file with a post-install script, but if xcat can do it within its own engine, then why write an unnecessary script for it(?) Thanks, guys, I couldn't have come this far with xcat without your help :-) --imam On Wed, Oct 25, 2017 at 6:29 AM, Christian Caruthers <ccaruth...@lenovo.com<mailto:ccaruth...@lenovo.com>> wrote: Building on this, if you are not running discovery for any nodes in this cluster, you can remove or disable the following lines in the chain table: "ipmi",,,"runcmd=bmcsetup,shell","nodediscover",, "blade",,,"standby","nodediscover",, Adding a “1” to the end of each line will disable it. 
I’m not sure if that will stop xCAT from attempting to discover nodes, or if there’s a way to just have any unknown nodes boot to the genesis shell. Possibly an entry like: “compute”,,,”shell”,”nodediscover”,, You could also possibly do manual discovery using the nodediscover command. I have not used this method, but my understanding is that you basically tell xCAT to expect a PXE request from n02, so it will assume the next MAC that pings the DHCP server is n02. Again, if chain is not configured to run bmcsetup (a la my example above), it should just run nodediscover and drop you to the genesis shell prompt on the node. The chain table “Controls what operations are done (and it what order) when a node is discovered and deployed.” See the “man chain” for details, but this is one of the areas that dictates how the DHCP server responds to PXE requests. You contr
Re: [xcat-user] Same short name causing makehosts problems
Oddly, when I set up example 2, I received the following error when running makedns –n: Error: Failure encountered updating bloom.geode.iu.edu., error was NOTZONE. See more details in system log. Error: Failure encountered updating public.geode.iu.edu., error was NOTZONE. See more details in system log. Looking in the syslog, I see: Oct 25 09:13:30 xcat-bloom named[26918]: client 172.20.0.2#52572/key xcat_key: updating zone 'bloom.geode.iu.edu/IN': update failed: update RR is outside zone (NOTZONE) Oct 25 09:13:35 xcat-bloom named[26918]: client 172.20.0.2#60694/key xcat_key: updating zone 'public.geode.iu.edu/IN': update failed: update RR is outside zone (NOTZONE) Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Yuan Y Bai [mailto:by...@cn.ibm.com] Sent: Monday, October 23, 2017 2:48 AM To: xcat-user@lists.sourceforge.net Cc: xcat-user@lists.sourceforge.net Subject: Re: [xcat-user] Same short name causing makehosts problems Hi Christian, If you only want to "external" networks into "/etc/hosts", you can only all it into hosts table like exemaple 1). But I noticed that your want to configure "bond0" and "ens2f1" in nics table. so I modified your nics table and hosts table a little and get the another results which near your requirements and make sure `nics` table configuration correct. The different is : 2 “external” networks in favor of “farnsworth.domain.name” are added into /etc/hosts, but it has short name with "iface". 
like example 2) example 1) only all "external" networks into /etc/hosts: ]# cat /etc/hosts 172.20.0.11 node1 node1.cluster.local 172.29.0.11 node1-imm node1-imm.cluster.local 10.0.0.38 farnsworth farnsworth.dc.foo.edu 148.143.201.25 farnsworth farnsworth.it.raboof.edu 172.40.0.11 node1-ib node1-ib.cluster.local ]# tabdump hosts #node,ip,hostnames,otherinterfaces,comments,disable "node1","172.20.0.11",,"-imm:172.29.0.11,farnsworth.dc.foo.edu:10.0.0.38,farnsworth.it.raboof.edu:148.143.201.25",, ]# tabdump nics #node,nicips,nichostnamesuffixes,nichostnameprefixes,nictypes,niccustomscripts,nicnetworks,nicaliases,nicextraparams,nicdevices,nicsadapter,comments,disable "node1","ib0!172.40.0.11","ib0!-ib",,"ib0!Infiniband,bond0!Ethernet,ens2f1!Ethernet","bond0!configbond-mtu bond0 ens1@ens1d1 mode=4@xmit_hash_policy=layer3+4@miimon=100","ib0!IPoIB,bond0!intersite,ens2f1!public",,,"bond0!MTU=9000,ens2f1!MTU=9000",,, exmaple 2) configure bond0 in nics table example: ]# cat /etc/hosts 127.0.0.1 localhost 172.20.0.11 node1 node1.cluster.local 172.29.0.11 node1-imm node1-imm.cluster.local 148.143.201.25 node1-ens2f1 farnsworth.it.raboof.edu 172.40.0.11 node1-ib node1-ib.cluster.local 10.0.0.38 node1-bond0 farnsworth.dc.foo.edu ]# lsdef node1 |grep nic nicaliases.bond0=farnsworth.dc.foo.edu nicaliases.ens2f1=farnsworth.it.raboof.edu niccustomscripts.bond0=configbond-mtu bond0 ens1@ens1d1 mode=4@xmit_hash_policy=layer3+4@miimon=100 nicdevices.bond0=MTU=9000 nicdevices.ens2f1=MTU=9000 nichostnamesuffixes.ib0=-ib nicips.ib0=172.40.0.11 nicips.bond0=10.0.0.38 nicips.ens2f1=148.143.201.25 nicnetworks.ib0=IPoIB nicnetworks.bond0=intersite nicnetworks.ens2f1=public nictypes.ib0=Infiniband nictypes.bond0=Ethernet nictypes.ens2f1=Ethernet ]# tabdump hosts #node,ip,hostnames,otherinterfaces,comments,disable "node1","172.20.0.11",,"-imm:172.29.0.11",, Best Regards -- Yuan Bai (白媛) CSTL HPC System Management Development Tel:86-10-82451401 E-mail: by...@cn.ibm.com<mailto:by...@cn.ibm.com> 
Address: IBM ZGC Campus. Ring Building 28, ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District, Beijing P.R.China 100193 IBM环宇大厦 北京市海淀区东北旺西路8号,中关村软件园28号楼 邮编:100193 - Original message - From: Christian Caruthers <ccaruth...@lenovo.com<mailto:ccaruth...@lenovo.com>> To: "xCAT Users Mailing list (xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>)" <xcat-user@lists.sourceforge.net<mailto:xcat-user@lists.sourceforge.net>> Cc: Subject: [xcat-user] Same short name causing makehosts problems Date: Sat, Oct 21, 2017 2:46 AM Running version 2.13.5.POST107.g8489de8 I have a handful on nodes that use the same short name on 2 networks. Here’s what it should look like: node1.cluster.local 172.20.0.11 node1-imm.cluster.local 172.29.0.11 node1-ib.cluster.local 172.40.0.11 farnsworth.dc.foo.edu 10.0.0.38 farnsworth.it.raboof.edu 148.143.201.25 Networks setup is: #netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,di
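The hosts.otherinterfaces values in the examples above mix two forms: entries beginning with "-" (a suffix appended to the node name, as in "-imm:172.29.0.11") and entries carrying a full name (as in "farnsworth.dc.foo.edu:10.0.0.38"). The sketch below only mimics that naming convention as it appears in the example output. It is not how makehosts is implemented, it ignores domain handling, and the function name expand_otherinterfaces is made up.

```shell
#!/bin/sh
# Hypothetical illustration of the otherinterfaces naming convention seen
# in the examples above; this is NOT the makehosts implementation.
# Usage: expand_otherinterfaces NODE 'name:ip,name:ip,...'
# Prints one "IP NAME" pair per line.
expand_otherinterfaces() {
    node=$1
    printf '%s\n' "$2" | tr ',' '\n' | while IFS=: read -r name ip; do
        [ -n "$name" ] || continue
        case "$name" in
            -*) name="${node}${name}" ;;   # "-imm" becomes "node1-imm"
        esac
        printf '%s %s\n' "$ip" "$name"
    done
}

# Example with the value from example 1:
# expand_otherinterfaces node1 \
#   '-imm:172.29.0.11,farnsworth.dc.foo.edu:10.0.0.38,farnsworth.it.raboof.edu:148.143.201.25'
```

Seen this way, both external entries shorten to the same "farnsworth" in example 1, which is the duplicate short name this thread is about.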
Re: [xcat-user] How to disable BMC discovery
Building on this, if you are not running discovery for any nodes in this cluster, you can remove or disable the following lines in the chain table: "ipmi",,,"runcmd=bmcsetup,shell","nodediscover",, "blade",,,"standby","nodediscover",, Adding a “1” to the end of each line will disable it. I’m not sure if that will stop xCAT from attempting to discover nodes, or if there’s a way to just have any unknown nodes boot to the genesis shell. Possibly an entry like: “compute”,,,”shell”,”nodediscover”,, You could also possibly do manual discovery using the nodediscover command. I have not used this method, but my understanding is that you basically tell xCAT to expect a PXE request from n02, so it will assume the next MAC that pings the DHCP server is n02. Again, if chain is not configured to run bmcsetup (a la my example above), it should just run nodediscover and drop you to the genesis shell prompt on the node. The chain table “Controls what operations are done (and it what order) when a node is discovered and deployed.” See the “man chain” for details, but this is one of the areas that dictates how the DHCP server responds to PXE requests. You control what is served to PXE clients using the nodeset command. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Er Tao Zhao [mailto:erta...@cn.ibm.com] Sent: Wednesday, October 25, 2017 3:58 AM To: xcat-user@lists.sourceforge.net Subject: Re: [xcat-user] How to disable BMC discovery Hi, Imam Have you assign IP address for n02? Pls follow the steps below to install OS. 1. chdef n02 ip=x.x.x.x 2. makehosts n02 makedns n02 ==> To generate nodename/IP mapping. 3. nodeset n02 osimage=centos6.9-x86_64-install-compute ==> 1. generate the IP/Mac of n02 mapping(DHCP lease entry) internally 2. generate boot configuration file 3. generate kickstart configuration file. Then, pls reboot your compute node. Feel free to let me know if there is any more issue. Thx! 
Best Regards,
---
Zhao Er Tao
IBM China System and Technology Laboratory, Beijing
Tel: (86-10)82450485
Email: erta...@cn.ibm.com
Address: 1/F, 28 Building, ZhongGuanCun Software Park, No.8 DongBeiWang West Road, Haidian District, Beijing, 100193, P.R.China

- Original message -
From: Imam Toufique <techie...@gmail.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] How to disable BMC discovery
Date: Wed, Oct 25, 2017 12:25 PM

Hi Christian (and everyone on this list),

My apologies for the long post. The dump of the chain table is:

[root@xcatmaster xcat]# tabdump chain
#node,currstate,currchain,chain,ondiscover,comments,disable
"ipmi",,,"runcmd=bmcsetup,shell","nodediscover",,
"blade",,,"standby","nodediscover",,
"n03",,
"n01","install centos6.9-x86_64-compute","boot"

What does the chain table do?

I noticed that when I add a node and power it up (these nodes have no BMC), I get the following output on the screen:

Oct 24 10:17:41 10.1.1.63 (none) xcat.genesis.dodiscovery: Couldn't find MTM information in FRU, falling back to DMI (MTMS-based discovery may fail)
Oct 24 10:17:41 10.1.1.63 (none) xcat.genesis.dodiscovery: Beginning echo information to discovery packet file...
Oct 24 10:17:42 10.1.1.63 (none) xcat.genesis.dodiscovery: Discovery packet file is ready.
Oct 24 10:17:42 10.1.1.63 (none) xcat.genesis.dodiscovery: Sending the discovery packet to xCAT (10.1.1.20:3001)...
Oct 24 10:17:42 10.1.1.63 (none) xcat.genesis.dodiscovery: Sleeping 5 seconds...

Then, when I run nodediscoverls, I get this:

[root@xcatmaster xcat]# nodediscoverls -t undef
UUID                                  NODE   METHOD  MTM                    SERIAL
4C4C4544-0059-3410-8048-C3C04F4D4E31  undef  undef   Dell Inc:OptiPlex 980  CY4HMN1

Then, when I define the node with:

> nodediscoverdef -u -n

I see in the log:

Oct 24 18:50:37 n06 xcat.genesis.doxcat: Received request=standby, will call xCAT back in 30 seconds.
Discovery is complete, run nodeset on this node to provision an Operating System

What nodeset command am I supposed to run for the node? Is it 'nodeset osimage'?

I have another issue, about renaming a node. I had a node built with the node name 'n02', and I wanted to see how xCAT handles a node name changeover. So, here is my scenario: node 'n2' is renamed to 'n6'
Re: [xcat-user] How to disable BMC discovery
What is the output of ‘nodeset n02 stat’? Also, what is the contents of the chain table (tabdump chain)? Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Imam Toufique [mailto:techie...@gmail.com] Sent: Tuesday, October 24, 2017 1:28 PM To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net> Subject: [xcat-user] How to disable BMC discovery Hello everyone, We are trying to test xcat out in our cluster and I need some help. Here is the issue I have: I have followed the following document to setup the environment: https://sourceforge.net/p/xcat/wiki/XCAT_iDataPlex_Cluster_Quick_Start/ I added my first node manually, as I did not want auto discovery. Here is my [root@xcatmaster xcat]# lsdef n02 Object name: n02 arch=x86_64 groups=compute installnic=mac mac=00:24:E8:36:AD:95 netboot=xnba os=centos6.9 postbootscripts=otherpkgs postscripts=syslog,remoteshell,syncfiles primarynic=mac profile=compute provmethod=centos6.9-x86_64-install-compute The node above does not have BMC, it is a standard PC. when I power on the node, i see the following messages in syslog: Oct 24 10:16:20 10.1.1.61 (none) xcat.genesis.dodiscovery: Beginning echo information to discovery packet file... Oct 24 10:16:20 10.1.1.61 (none) xcat.genesis.dodiscovery: Discovery packet file is ready. Oct 24 10:16:20 10.1.1.61 (none) xcat.genesis.dodiscovery: Sending the discovery packet to xCAT (10.1.1.20:3001)... Oct 24 10:16:20 10.1.1.61 (none) xcat.genesis.dodiscovery: Sleeping 5 seconds... Oct 24 10:16:21 10.1.1.63 (none) xcat.genesis.dodiscovery: Couldn't find MTM information in FRU, falling back to DMI (MTMS-based discovery may fail) Oct 24 10:16:21 10.1.1.63 (none) xcat.genesis.dodiscovery: Beginning echo information to discovery packet file... 
^C [root@xcatmaster xcat]# tail -f computes.log Oct 24 10:17:41 10.1.1.63 (none) xcat.genesis.dodiscovery: Couldn't find MTM information in FRU, falling back to DMI (MTMS-based discovery may fail) Oct 24 10:17:41 10.1.1.63 (none) xcat.genesis.dodiscovery: Beginning echo information to discovery packet file... Oct 24 10:17:42 10.1.1.63 (none) xcat.genesis.dodiscovery: Discovery packet file is ready. Oct 24 10:17:42 10.1.1.63 (none) xcat.genesis.dodiscovery: Sending the discovery packet to xCAT (10.1.1.20:3001)... Oct 24 10:17:42 10.1.1.63 (none) xcat.genesis.dodiscovery: Sleeping 5 seconds... I think it is trying to look for BMC and not finding it, therefore, it just keep looking for it? I am no expert in xcat, but is there a way to disable BMC discovery in xcat for nodes that do not have BMC? Please help, I need to get this thing up and going. Thanks a lot! -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
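One way to see at a glance which machines are still cycling through genesis discovery is to count the "Sending the discovery packet" lines per client address. The following is only a sketch: the function name discovery_summary is made up for illustration, and it assumes the syslog layout shown in the excerpts above (month, day, time, client IP in the fourth field).

```shell
#!/bin/sh
# Sketch: summarise genesis discovery activity per client IP.
# Assumes lines shaped like the excerpts in this thread:
#   "Oct 24 10:16:20 10.1.1.61 (none) xcat.genesis.dodiscovery: <message>"
# discovery_summary is a hypothetical helper name, not an xCAT command.
discovery_summary() {
    # Reads log lines on stdin; prints "<ip> <packets-sent>" sorted by IP.
    awk '/xcat\.genesis\.dodiscovery: Sending the discovery packet/ { sent[$4]++ }
         END { for (ip in sent) print ip, sent[ip] }' | sort
}

# Sample input copied from the log lines quoted in the thread.
sample_log='Oct 24 10:16:20 10.1.1.61 (none) xcat.genesis.dodiscovery: Sending the discovery packet to xCAT (10.1.1.20:3001)...
Oct 24 10:16:20 10.1.1.61 (none) xcat.genesis.dodiscovery: Sleeping 5 seconds...
Oct 24 10:17:42 10.1.1.63 (none) xcat.genesis.dodiscovery: Sending the discovery packet to xCAT (10.1.1.20:3001)...'

printf '%s\n' "$sample_log" | discovery_summary
```

Against a real log this would be run as `discovery_summary < computes.log`. A count that keeps growing for one address suggests that node is still booting the discovery image rather than an install or netboot target.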
[xcat-user] Same short name causing makehosts problems
Running version 2.13.5.POST107.g8489de8

I have a handful of nodes that use the same short name on 2 networks. Here's what it should look like:

node1.cluster.local 172.20.0.11
node1-imm.cluster.local 172.29.0.11
node1-ib.cluster.local 172.40.0.11
farnsworth.dc.foo.edu 10.0.0.38
farnsworth.it.raboof.edu 148.143.201.25

Networks setup is:

#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
"provision","172.20.0.0","255.255.255.0","ens3",,,"172.20.0.2",,"172.20.0.1",,"172.20.0.200-172.20.0.250",
"imm","172.29.0.0","255.255.255.0","ens3",,,"172.29.0.2",
"infra","172.30.0.0","255.255.255.0","ens3",,,"172.30.0.2",
"IPoIB","172.40.0.0","255.255.255.0","ens3",,,"172.40.0.2",
"intersite","10.0.0.0","255.255.255.0","ens8","dc.foo.edu",, ,
"public","148.143.201.0","255.255.255.224",,"149.165.232.1""it.raboof.edu",,,

If I set up the provisioning and IMM IPs in the hosts table w/ everything else defined in nics, I see:

Node setup:

Nics table:
"node1","ib0!172.40.0.11,bond0!10.0.0.38,ens2f1!148.143.201.25","ib0!-ib",,"ib0!Infiniband,bond0!Ethernet,ens2f1!Ethernet","bond0!configbond-mtu bond0 ens1@ens1d1 mode=4@xmit_hash_policy=layer3+4@miimon=100","ib0!IPoIB,bond0!intersite,ens2f1!public","bond0!farnsworth,ens2f1!farnsworth","bond0!MTU=9000,ens2f1!MTU=9000"

Hosts table:
"node1","172.20.0.11",,"-imm:172.29.0.11",,

After running makehosts -n, I see:

172.20.0.17 node1 node1.cluster.local
172.29.0.17 node1-imm node1-imm.cluster.local
148.143.201.25 node1-ens2f1 node1-ens2f1.it.raboof.edu farnsworth
172.40.0.17 node1-ib node1-ib.cluster.local
10.0.0.38 node1-bond0 node1-bond0.dc.foo.edu

Note that it creates an alias for farnsworth (which won't work for our purposes), but ignores the setting for the 10.0.0.0/bond0 net/interface.

In order to get rid of the alias and just enter the node in hosts as "farnsworth.dc.foo.edu" or "farnsworth.it.raboof.edu," I can enter them in the hosts.otherinterfaces field for the node, but again it will only enter 1 in the hosts file correctly. The other will have the "-iface" name and maybe an alias if it's defined in the nics table (which won't work for us).

I've also tried adding the external name as an unmanaged node and creating entries in the hosts table like:

"node1","172.20.0.11",,"-imm:172.29.0.11,farnsworth: 148.143.201.25",,
"farnsworth"," 10.0.0.38",,,"node1 inter-site link",

This leaves me with:

172.20.0.17 node1 node1.cluster.local
172.29.0.17 node1-imm node1-imm.cluster.local
148.143.201.25 farnsworth farnsworth.it.raboof.edu
172.40.0.17 node1-ib node1-ib.cluster.local
10.0.0.38 node1-bond0 node1-bond0.dc.foo.edu

The goal is to do away with the "-iface" host name on the 2 "external" networks in favor of just "farnsworth.domain.name". So far, I can only get one to work correctly.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872
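Absent a table setting that produces both names, one possible stopgap is to post-process the generated hosts lines and rewrite the entry for a given IP. This is a workaround sketch, not an xCAT feature: rename_host_entry is a hypothetical helper, and any later makehosts run would presumably regenerate the original entries, so the rewrite would need to be reapplied.

```shell
#!/bin/sh
# Sketch: replace the hosts entry for one IP with a single chosen FQDN.
# rename_host_entry is a hypothetical helper, not part of xCAT.
rename_host_entry() {
    # usage: rename_host_entry <ip> <fqdn>   (hosts-format lines on stdin)
    awk -v ip="$1" -v fqdn="$2" '
        $1 == ip { print ip, fqdn; next }   # rewrite the matching line
        { print }                           # pass every other line through
    '
}

# Sample lines taken from the makehosts output quoted in the post.
sample='172.20.0.17 node1 node1.cluster.local
10.0.0.38 node1-bond0 node1-bond0.dc.foo.edu'

printf '%s\n' "$sample" | rename_host_entry 10.0.0.38 farnsworth.dc.foo.edu
```

Chaining two calls, one per external IP, would cover both farnsworth names. The sketch deliberately writes no short alias, since the post says an alias will not work for this setup.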
Re: [xcat-user] Recommendations for a lab environment
I haven't used that feature, but it's documented here: https://sourceforge.net/p/xcat/wiki/Managing_Ethernet_Switches/ Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Kevin Keane [mailto:kke...@sandiego.edu] Sent: Tuesday, August 29, 2017 6:43 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] Recommendations for a lab environment Thank you so much for those suggestions! Are there any special requirements for the switches? My understanding that xcat manages switches as well as nodes. On Sat, Aug 26, 2017 at 6:54 AM, Christian Caruthers <ccaruth...@lenovo.com<mailto:ccaruth...@lenovo.com>> wrote: My lab cluster is comprised of old iDataPlex systems, a few 1Gb switches, and some old fibre connected storage. None of this is hardware I'll come across at a customer site. A few systems have old BMC management processors instead of IMMs. Here are some things I like to have in the lab: Systems w/ management controllers. Being able to power cycle and see console remotely is a must. So long as my software can work with it, it'll do. Multiple networks with ISLs between switches. Maybe I want to test or update a bonding script. Maybe I have a project that requires nodes be on 2 networks/VLANs. Storage. You don't use GPFS, but shared storage is still a part of your cluster. From a performance perspective, it would be nice to replicate the network you have since not all performance-related configurations scale linearly. Still, understanding what works on a 1Gb network will at least give you insight into what might work in a 10Gb network. For my lab system, the goal is flexibility. I want to be able to configure it multiple ways to address multiple scenarios (preferrably w/o having to drive 3.5 hours to Morrisville). I don't really care about the hardware - it's really old and lab performance is not going to match production. 
In your case, I think flexibility would still be valuable, but you have a production environment for which you want a test bed. Using what you have, replicate the systems in your production environment. You have a management node, some compute nodes, and an NFS storage node. Start with that. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872<tel:(757)%20289-9872> From: Kevin Keane [mailto:kke...@sandiego.edu<mailto:kke...@sandiego.edu>] Sent: Friday, August 25, 2017 1:43 PM To: xCAT Users Mailing list Subject: [xcat-user] Recommendations for a lab environment We currently have a production cluster managed by xCAT - Lenovo-based, 16 compute nodes, management node, storage node. Each node has dual 1 Gb NICs, plus two 10 Gb NICs used as interconnect. All Intel CPUs. What I am looking for is to set up a lab environment "on the cheap" using old retired hardware (or possibly a virtual environment). I can get my hand on retired HP or Dell servers easily, and probably also on some networking hardware (although not 10 Gb). By necessity, this cluster can't be a clone of the production environment. It will have fewer cores per node, fewer nodes, less memory, different networking setup. The goal is four-fold: - Learn more about xCAT without breaking the production cluster. - Test improvements for the production cluster. - Test and dry-run future major future software upgrades (such as, from RedHat 6.7 to 7.4). - Test installing additional application software for end-users before it goes on the production cluster. What I'm looking for is recommendations on what to pay attention to in order to make this "playground" as useful as possible, within the constraints. Thanks! 
-- Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu Maher Hall, 192 | 5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
Re: [xcat-user] Recommendations for a lab environment
My lab cluster is comprised of old iDataPlex systems, a few 1Gb switches, and some old fibre connected storage. None of this is hardware I'll come across at a customer site. A few systems have old BMC management processors instead of IMMs. Here are some things I like to have in the lab: Systems w/ management controllers. Being able to power cycle and see console remotely is a must. So long as my software can work with it, it'll do. Multiple networks with ISLs between switches. Maybe I want to test or update a bonding script. Maybe I have a project that requires nodes be on 2 networks/VLANs. Storage. You don't use GPFS, but shared storage is still a part of your cluster. From a performance perspective, it would be nice to replicate the network you have since not all performance-related configurations scale linearly. Still, understanding what works on a 1Gb network will at least give you insight into what might work in a 10Gb network. For my lab system, the goal is flexibility. I want to be able to configure it multiple ways to address multiple scenarios (preferrably w/o having to drive 3.5 hours to Morrisville). I don't really care about the hardware - it's really old and lab performance is not going to match production. In your case, I think flexibility would still be valuable, but you have a production environment for which you want a test bed. Using what you have, replicate the systems in your production environment. You have a management node, some compute nodes, and an NFS storage node. Start with that. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Kevin Keane [mailto:kke...@sandiego.edu] Sent: Friday, August 25, 2017 1:43 PM To: xCAT Users Mailing list Subject: [xcat-user] Recommendations for a lab environment We currently have a production cluster managed by xCAT - Lenovo-based, 16 compute nodes, management node, storage node. Each node has dual 1 Gb NICs, plus two 10 Gb NICs used as interconnect. All Intel CPUs. 
What I am looking for is to set up a lab environment "on the cheap" using old retired hardware (or possibly a virtual environment). I can get my hand on retired HP or Dell servers easily, and probably also on some networking hardware (although not 10 Gb). By necessity, this cluster can't be a clone of the production environment. It will have fewer cores per node, fewer nodes, less memory, different networking setup. The goal is four-fold: - Learn more about xCAT without breaking the production cluster. - Test improvements for the production cluster. - Test and dry-run future major future software upgrades (such as, from RedHat 6.7 to 7.4). - Test installing additional application software for end-users before it goes on the production cluster. What I'm looking for is recommendations on what to pay attention to in order to make this "playground" as useful as possible, within the constraints. Thanks! -- ___ Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu<mailto:kke...@sandiego.edu> Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859<tel:%28619%29%20260-2298> -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] Setting user hostnames
I agree, I don't think it is supported. I think I can make it work using the solution I outlined, but that doesn't appear to be working according to the documentation. Looking for verification of what I'm trying to do, confirmation that the envlist files are still parsed (it appears they are), and/or suggestions. Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 Sent from my mobile device On Mon, Aug 21, 2017 at 5:58 PM -0500, "Kevin Keane" <kke...@sandiego.edu<mailto:kke...@sandiego.edu>> wrote: The way I read it, this was a design suggestion for a future version of xCAT; it is described as a low-priority item. It may not yet be implemented. On Mon, Aug 21, 2017 at 3:42 PM, Christian Caruthers <ccaruth...@lenovo.com<mailto:ccaruth...@lenovo.com>> wrote: Using this as a reference: https://sourceforge.net/p/xcat/wiki/Exporting_more_table_attributes_to_nodes/ Currently, we have n1…n10 configured in xCAT. These are internal hostnames. All of these hosts have external "public" IP addresses, and the desire is to have the hosname match the public name. Short of setting up DNS for the public IPs, which I really don't want to do since there will be an external DNS server w/ that info, I thought I could use an unused table.field (prodkey.key) for a .envlist file. So I have: ls -la /install/custom/install/rh/x86_84/chum.envlist prodkey,key,,NEWHOST (NOTE: I have also tried this in /install/custom/install/rh/) lsdef n1 | egrep 'profile|post|productkey' postbootscripts=,chumhostname productkey=rabidgator1.floridaman.arrr.edu<http://rabidgator1.floridaman.arrr.edu> profile=chum cat /install/postscripts/chumhostname hostnamectl set-hostname $NEWHOST When I try this with updatenode, it doesn't appear to be grabbing the info from that envlist file. Is this feature still in use? If not, what's a recommended way to set a different hostname? 
Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 -- Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu Maher Hall, 192 | 5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
[xcat-user] Setting user hostnames
Using this as a reference: https://sourceforge.net/p/xcat/wiki/Exporting_more_table_attributes_to_nodes/

Currently, we have n1...n10 configured in xCAT. These are internal hostnames. All of these hosts have external "public" IP addresses, and the desire is to have the hostname match the public name. Short of setting up DNS for the public IPs, which I really don't want to do since there will be an external DNS server w/ that info, I thought I could use an unused table.field (prodkey.key) for a .envlist file. So I have:

ls -la /install/custom/install/rh/x86_84/chum.envlist
prodkey,key,,NEWHOST

(NOTE: I have also tried this in /install/custom/install/rh/)

lsdef n1 | egrep 'profile|post|productkey'
postbootscripts=,chumhostname
productkey=rabidgator1.floridaman.arrr.edu
profile=chum

cat /install/postscripts/chumhostname
hostnamectl set-hostname $NEWHOST

When I try this with updatenode, it doesn't appear to be grabbing the info from that envlist file. Is this feature still in use? If not, what's a recommended way to set a different hostname?

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872
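While debugging, it may help to make the postscript fail loudly when NEWHOST never arrives, rather than calling hostnamectl with an empty argument. Below is a minimal sketch of such a guard. DRY_RUN and the function name set_new_hostname are illustrative assumptions, not anything xCAT provides; NEWHOST is the variable named in the .envlist line above.

```shell
#!/bin/sh
# Sketch of a more defensive chumhostname postscript.
# NEWHOST is expected to come from the .envlist mechanism described above.
# DRY_RUN and set_new_hostname are hypothetical, used here for illustration.
set_new_hostname() {
    if [ -z "${NEWHOST:-}" ]; then
        # The variable was not exported to the postscript environment.
        echo "chumhostname: NEWHOST is not set; leaving hostname unchanged" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print what would be run.
        echo "hostnamectl set-hostname $NEWHOST"
    else
        hostnamectl set-hostname "$NEWHOST"
    fi
}

# Dry-run demonstration with the value from the lsdef output above.
( DRY_RUN=1; NEWHOST=rabidgator1.floridaman.arrr.edu; set_new_hostname )
```

If the error branch fires under updatenode, the variable is not reaching the node at all, which points at the envlist lookup rather than the script. The path quoted above reads x86_84; if that matches what is on disk rather than being a typing slip, it is worth comparing against the usual x86_64 directory name.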
Re: [xcat-user] using xCAT to view "Active Events" for Lenovo System x servers
Have you looked at 'rvitals mynode leds' ? Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872 From: Rundall, Jacob D [mailto:rund...@illinois.edu] Sent: Wednesday, May 17, 2017 3:13 PM To: xcat-user@lists.sourceforge.net Subject: [xcat-user] using xCAT to view "Active Events" for Lenovo System x servers I’m curious if anybody can help me figure out how to use xCAT to view “Active Events” for Lenovo System x servers, as shown in the web interface of the IMM. Using pasu gets me somewhere, as follows: pasu mynode immapp showimmlog | grep “Severity:5” There are a few shortcomings, though, as compared to the web interface of the IMM: 1. pasu shows me past events that are no longer active (and the recovery events are lower severity so they don’t make it through the grep, so it’s not obvious that the events have been recovered from, at least not with this command). 2. pasu only returns items with some kind of sequence number rather than a date and time. 3. The web interface also sometimes has “Additional Information for Event” as well, which I cannot figure out how to view using pasu. Here is an example of what I can see in the IMM web interface: Error System 25 June 2016, 03:14:40.788 AM An Uncorrectable Error has occurred on PCIs. Error System 25 June 2016, 03:15:13.638 AM Fault in slot 3 on system System x3650 M5. Clicking “more” on the latter provides the following additional information: [S.68005] An error has been detected by the the IIO core logic on CPU 1. The Global Fatal Error Status register contains 0x0. The Global Non-Fatal Error Status register contains 0x40. Please check error logs for the presence of additional downstream device error data. And here’s the output that I get using my pasu command shown above (with grep): monitor01: 19 | Severity:5 | Message:Redundancy Lost for Power Unit has asserted. monitor01: 22 | Severity:5 | Message:Redundancy Lost for Power Unit has asserted. 
monitor01: 27 | Severity:5 | Message:Redundancy Lost for Power Unit has asserted. monitor01: 49 | Severity:5 | Message:Redundancy Lost for Power Unit has asserted. monitor01: 56 | Severity:5 | Message:Redundancy Lost for Power Unit has asserted. monitor01: 125 | Severity:5 | Message:A Fatal Bus Error has occurred on bus CPU 2 PECI. monitor01: 126 | Severity:5 | Message:An Uncorrectable Error has occurred on PCIs. monitor01: 128 | Severity:5 | Message:Fault in slot 3 on system System x3650 M5. monitor01: 138 | Severity:5 | Message:A Fatal Bus Error has occurred on bus CPU 2 PECI. monitor01: 164 | Severity:5 | Message:A Fatal Bus Error has occurred on bus CPU 2 PECI. Events 126 and 128 clearly correspond to what is shown as “Active Events” in the web interface. But it’s not obvious that the others are not active unless I dig deeper in the IMM log (e.g., without filtering through grep). When I do that I can eventually find subsequent recovery events for the other sev 5 events which shows why they are not considered “active”. On a related note, does anyone know of a way with xCAT (pasu or otherwise) to view status/info about the following via the command-line from an xCAT management node: 1. IMM web interface: System Status -> System Information -> Check Log LED [I suspect the status here corresponds to the status of the “Check log LED” on the front of the server]. 2. Front of the server: “System-error LED” 3. IMM web interface: System Status -> Hardware Health: status of each component type (i.e., “Cooling Devices”, “Power Modules”, “Local Storage”, “Processors”, “Memory”, “System”) Thanks very much, Jake Rundall -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ xCAT-user mailing list xCAT-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xcat-user
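The repeated severity-5 lines can at least be collapsed so that each distinct message appears once, with a count and its most recent sequence number. This sketch does not decide which events are still active, because that would need the recovery entries, whose format is not shown here. summarize_sev5 is a hypothetical helper, and the input layout is assumed to be exactly the "node: seq | Severity:N | Message:text" form quoted above.

```shell
#!/bin/sh
# Sketch: collapse severity-5 IMM log lines into one row per distinct message.
# Assumes the layout quoted in the post:
#   "monitor01: 126 | Severity:5 | Message:An Uncorrectable Error ..."
# summarize_sev5 is a hypothetical helper name.
summarize_sev5() {
    # stdout: "<count> <last-seq> <message>", ordered by last sequence number
    awk -F' [|] ' '
        $2 == "Severity:5" {
            split($1, head, ": ")           # head[1] = node, head[2] = sequence
            msg = $3
            sub(/^Message:/, "", msg)
            count[msg]++
            last[msg] = head[2]             # later lines overwrite earlier ones
        }
        END { for (m in count) print count[m], last[m], m }' | sort -k2,2n
}

# Sample lines copied from the output quoted in the post.
sample='monitor01: 125 | Severity:5 | Message:A Fatal Bus Error has occurred on bus CPU 2 PECI.
monitor01: 126 | Severity:5 | Message:An Uncorrectable Error has occurred on PCIs.
monitor01: 128 | Severity:5 | Message:Fault in slot 3 on system System x3650 M5.
monitor01: 138 | Severity:5 | Message:A Fatal Bus Error has occurred on bus CPU 2 PECI.'

printf '%s\n' "$sample" | summarize_sev5
```

In practice the function would be fed from the pasu command in the post, as in `pasu mynode immapp showimmlog | summarize_sev5`. When typing the grep from the post by hand, use straight quotes around the pattern, since the curly quotes shown in the message would be taken literally by the shell.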
[xcat-user] configbond script
I have the following setup:

nodels gss1 nics
gss1: nics.nicips: bond0!10.100.10.20
gss1: nics.nicextraparams: bond0!MTU=9000
gss1: nics.nicsadapter: bond0!MTU=9000
gss1: nics.nichostnamesuffixes: bond0!-10g
gss1: nics.nicdevices: bond0!enp134s0,enp134s0d1,enp27s0,enp27s0d1,enp32s0,enp32s0d1
gss1: nics.node: gss1
gss1: nics.niccustomscripts: bond0!configbond bond0 enp134s0@enp134s0d1@enp27s0@enp27s0d1@enp32s0@ enp32s0d1 miimon=100@mode=4@xmit_hash_policy=1
gss1: nics.nicnetworks: bond0!10G
gss1: nics.nictypes: bond0!Ethernet

Despite this, the MTU setting isn't being placed in the ifcfg-bond script. Looking at the nics manpage, it's not really clear which setting does which:

nicextraparams
    Comma-separated list of extra parameters that will be used for each NIC configuration.
nicsadapter
    Comma-separated list of extra parameters that will be used for each NIC configuration.

Regards, Christian Caruthers Lenovo Professional Services Mobile: 757-289-9872
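To compare what the table says with what ends up on the node, a small helper can pull one parameter out of a nic!KEY=VALUE attribute string. This is a sketch under assumptions: nic_param is a hypothetical name, entries are taken to be comma-separated per NIC as in the output above, and multiple parameters for one NIC are assumed to be space-separated, which the post does not confirm.

```shell
#!/bin/sh
# Sketch: extract one KEY=VALUE parameter for a NIC from a nics-table
# attribute value such as 'bond0!MTU=9000'.
# nic_param is a hypothetical helper; the space-separated multi-parameter
# form is an assumption, not confirmed by the post.
nic_param() {
    # usage: nic_param <attribute-value> <nic> <key>
    printf '%s\n' "$1" | tr ',' '\n' | awk -v nic="$2" -v key="$3" '
        {
            bang = index($0, "!")
            # Skip entries that belong to a different NIC.
            if (bang == 0 || substr($0, 1, bang - 1) != nic) next
            n = split(substr($0, bang + 1), kv, " ")
            for (i = 1; i <= n; i++) {
                eq = index(kv[i], "=")
                if (eq > 0 && substr(kv[i], 1, eq - 1) == key) {
                    print substr(kv[i], eq + 1)
                    exit
                }
            }
        }'
}

# Value taken from the nodels output above.
nic_param 'bond0!MTU=9000' bond0 MTU   # prints 9000
```

The printed value can then be checked against the generated file on the node, for example by grepping for a line beginning with MTU= in ifcfg-bond0. If the table holds 9000 and the file has no MTU line, the gap is in the script that writes the file rather than in the table entry.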
Re: [xcat-user] OFED Installation Problem
If you enable debugging on your postscript and run it using updatenode -P, does it work?

Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872

-Original Message-
From: Hakan Bayındır [mailto:hakan.bayin...@tubitak.gov.tr]
Sent: Wednesday, March 29, 2017 8:49 AM
To: xcat-user@lists.sourceforge.net
Subject: [xcat-user] OFED Installation Problem

Hello,

I'm advancing in my installation and tests; however, I'm having a problem installing Mellanox's OFED distribution. I've followed the instructions on xCAT's documentation here. When the installation completes I get the error "Can't open perl script "mlnxofedinstall": No such file or directory". I know that all dependencies are installed on the system. When I copy the distribution by hand and run mlnxofedinstall --force, everything is installed as it should be. I'm running CentOS 7.3, x86_64, with the latest xCAT release.

Best regards,
Hakan Bayindir

P.S.: Resent due to possible dropping of message due to GPG signing.
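The "Can't open perl script" message is what perl prints when the script path does not resolve from the current directory. One hypothesis, and it is only a hypothesis, is that the postscript runs from a different working directory than the manual test did. The sketch below shows tracing plus an explicit existence check and directory change before the installer is invoked. DEBUG and run_installer are hypothetical names, not part of xCAT or of the Mellanox scripts.

```shell
#!/bin/sh
# Sketch: trace a postscript and run an installer from its own directory.
# DEBUG and run_installer are hypothetical names used for illustration.
if [ "${DEBUG:-0}" = "1" ]; then
    set -x   # print each command as it runs, so the failing step is visible
fi

run_installer() {
    # usage: run_installer <dir> <script> [args...]
    dir=$1
    script=$2
    shift 2
    if [ ! -f "$dir/$script" ]; then
        echo "run_installer: $dir/$script not found" >&2
        return 1
    fi
    # Run inside the installer's directory so relative paths resolve there.
    ( cd "$dir" && ./"$script" "$@" )
}
```

A call would look like `run_installer <unpacked OFED directory> mlnxofedinstall --force`, with the directory being wherever the distribution was unpacked on the node. If the existence check fails under updatenode but not by hand, the files are not where the postscript expects them at that point in the run.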
Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1
You don't have to build the stateless image on the management node. You can use a service node or some other surrogate running the same OS, but as far as I know you can't build a RHEL/CentOS image on a Debian/Ubuntu host. If you don't use stateful installs rather than stateless images, you can have as heterogenous an environment as you like. Stateless images are somewhat limiting. Christian Caruthers Lenovo Professional Services 757-289-9872 Sent from my mobile device On Mon, Mar 6, 2017 at 10:31 AM -0500, "Nora D" <uniqu...@live.com<mailto:uniqu...@live.com>> wrote: What if I have one management node and several slave nodes and I want to deploy each slave node with different OS images. How can I do that? On Mar 6, 2017, at 6:04 PM, Christian Caruthers <ccaruth...@lenovo.com<mailto:ccaruth...@lenovo.com>> wrote: You would need to either build your VMs to use Ubuntu or reinstall your xcat management node, ideally using centos 7.3. It's a good practice to use the same OS on the stateless images as is on the xCAT management node. 
Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Nora D [mailto:uniqu...@live.com] Sent: Saturday, March 4, 2017 11:12 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1 When I generate the image using the following command: genimage centos7.3-x86_64-netboot-compute I get the following: Generating image: cd /opt/xcat/share/xcat/netboot/centos; ./genimage -a x86_64 -o centos7.3 -p compute --srcdir "/install/centos7.3/x86_64" --pkglist /opt/xcat/share/xcat/netboot/centos/compute.centos7.pkglist --otherpkgdir "/install/post/otherpkgs/centos7.3/x86_64" --postinstall /opt/xcat/share/xcat/netboot/centos/compute.centos7.postinstall --rootimgdir /install/netboot/centos7.3/x86_64/compute --tempfile /tmp/xcat_genimage.4583 centos7.3-x86_64-netboot-compute 100 blocks /opt/xcat/share/xcat/netboot/centos 100 blocks /opt/xcat/share/xcat/netboot/centos yum -y -c /tmp/genimage.4590.yum.conf --installroot=/install/netboot/centos7.3/x86_64/compute/rootimg/ --disablerepo=* --enablerepo=centos7.3-x86_64-0 install bash dracut-network nfs-utils openssl dhclient kernel openssh-server openssh-clients iputils bc irqbalance procps-ng wget vim-minimal ntp rpm rsync rsyslog e2fsprogs parted net-tools gzip tar xz sh: 1: yum: not found yum invocation failed Error: Command failed: cd /opt/xcat/share/xcat/netboot/centos; ./genimage -a x86_64 -o centos7.3 -p compute --srcdir "/install/centos7.3/x86_64" --pkglist /opt/xcat/share/xcat/netboot/centos/compute.centos7.pkglist --otherpkgdir "/install/post/otherpkgs/centos7.3/x86_64" --postinstall /opt/xcat/share/xcat/netboot/centos/compute.centos7.postinstall --rootimgdir /install/netboot/centos7.3/x86_64/compute --tempfile /tmp/xcat_genimage.4583 centos7.3-x86_64-netboot-compute 2>&1. 
Error message: 100 blocks /opt/xcat/share/xcat/netboot/centos 100 blocks /opt/xcat/share/xcat/netboot/centos yum -y -c /tmp/genimage.4590.yum.conf --installroot=/install/netboot/centos7.3/x86_64/compute/rootimg/ --disablerepo=* --enablerepo=centos7.3-x86_64-0 install bash dracut-network nfs-utils openssl dhclient kernel openssh-server openssh-clients iputils bc irqbalance procps-ng wget vim-minimal ntp rpm rsync rsyslog e2fsprogs parted net-tools gzip tar xz sh: 1: yum: not found yum invocation failed. I think because yum is used with redhat not Ubuntu. How can I work around this? From: Nora D <uniqu...@live.com<mailto:uniqu...@live.com>> Sent: Saturday, March 4, 2017 3:44 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1 Great! I will check the link. Thank you for your help. Sent from my iPhone On Mar 4, 2017, at 6:41 PM, Jarrod Johnson <jjohns...@lenovo.com<mailto:jjohns...@lenovo.com>> wrote: Did you want to do stateless? If so: https://sourceforge.net/p/xcat/wiki/Build_and_Boot_Stateless_Images/ It’s probably easiest to let it make the VMs for you. It is possible to use existing VMs with some work, but based on your situation, I think it’s probably easier to go with the new vm. From: Nora D [mailto:uniqu...@live.com] Sent: Saturday, March 04, 2017 10:24 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1 Thank you so much you saved my life!!! I really appreciate it. I noticed that it creates a new VM I thought it will power off a VM that I created myself. Now the rpower on/off works with no error. However, I am aiming to achieve pxe booting on the vm. Can you help me with that please? When I execute the following command: nodeset vm2 osimage=centos7.3-x86_64-netboot-compute I get the following errors: Error: Did you run "gen
Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1
You would need to either build your VMs to use Ubuntu or reinstall your xcat management node, ideally using centos 7.3. It's a good practice to use the same OS on the stateless images as is on the xCAT management node. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Nora D [mailto:uniqu...@live.com] Sent: Saturday, March 4, 2017 11:12 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1 When I generate the image using the following command: genimage centos7.3-x86_64-netboot-compute I get the following: Generating image: cd /opt/xcat/share/xcat/netboot/centos; ./genimage -a x86_64 -o centos7.3 -p compute --srcdir "/install/centos7.3/x86_64" --pkglist /opt/xcat/share/xcat/netboot/centos/compute.centos7.pkglist --otherpkgdir "/install/post/otherpkgs/centos7.3/x86_64" --postinstall /opt/xcat/share/xcat/netboot/centos/compute.centos7.postinstall --rootimgdir /install/netboot/centos7.3/x86_64/compute --tempfile /tmp/xcat_genimage.4583 centos7.3-x86_64-netboot-compute 100 blocks /opt/xcat/share/xcat/netboot/centos 100 blocks /opt/xcat/share/xcat/netboot/centos yum -y -c /tmp/genimage.4590.yum.conf --installroot=/install/netboot/centos7.3/x86_64/compute/rootimg/ --disablerepo=* --enablerepo=centos7.3-x86_64-0 install bash dracut-network nfs-utils openssl dhclient kernel openssh-server openssh-clients iputils bc irqbalance procps-ng wget vim-minimal ntp rpm rsync rsyslog e2fsprogs parted net-tools gzip tar xz sh: 1: yum: not found yum invocation failed Error: Command failed: cd /opt/xcat/share/xcat/netboot/centos; ./genimage -a x86_64 -o centos7.3 -p compute --srcdir "/install/centos7.3/x86_64" --pkglist /opt/xcat/share/xcat/netboot/centos/compute.centos7.pkglist --otherpkgdir "/install/post/otherpkgs/centos7.3/x86_64" --postinstall /opt/xcat/share/xcat/netboot/centos/compute.centos7.postinstall --rootimgdir /install/netboot/centos7.3/x86_64/compute --tempfile 
/tmp/xcat_genimage.4583 centos7.3-x86_64-netboot-compute 2>&1. Error message: 100 blocks /opt/xcat/share/xcat/netboot/centos 100 blocks /opt/xcat/share/xcat/netboot/centos yum -y -c /tmp/genimage.4590.yum.conf --installroot=/install/netboot/centos7.3/x86_64/compute/rootimg/ --disablerepo=* --enablerepo=centos7.3-x86_64-0 install bash dracut-network nfs-utils openssl dhclient kernel openssh-server openssh-clients iputils bc irqbalance procps-ng wget vim-minimal ntp rpm rsync rsyslog e2fsprogs parted net-tools gzip tar xz sh: 1: yum: not found yum invocation failed. I think because yum is used with redhat not Ubuntu. How can I work around this? From: Nora D <uniqu...@live.com> Sent: Saturday, March 4, 2017 3:44 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1 Great! I will check the link. Thank you for your help. Sent from my iPhone On Mar 4, 2017, at 6:41 PM, Jarrod Johnson <jjohns...@lenovo.com<mailto:jjohns...@lenovo.com>> wrote: Did you want to do stateless? If so: https://sourceforge.net/p/xcat/wiki/Build_and_Boot_Stateless_Images/ It's probably easiest to let it make the VMs for you. It is possible to use existing VMs with some work, but based on your situation, I think it's probably easier to go with the new vm. From: Nora D [mailto:uniqu...@live.com] Sent: Saturday, March 04, 2017 10:24 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1 Thank you so much you saved my life!!! I really appreciate it. I noticed that it creates a new VM I thought it will power off a VM that I created myself. Now the rpower on/off works with no error. However, I am aiming to achieve pxe booting on the vm. Can you help me with that please? When I execute the following command: nodeset vm2 osimage=centos7.3-x86_64-netboot-compute I get the following errors: Error: Did you run "genimage" before running "packimage"? 
kernel cannot be found at /install/netboot/centos7.3/x86_64/compute/kernel on master.cluster.com<http://master.cluster.com> Error: Some nodes failed to set up netboot resources on server master.cluster.com<http://master.cluster.com>, aborting From: Jarrod Johnson <jjohns...@lenovo.com<mailto:jjohns...@lenovo.com>> Sent: Saturday, March 4, 2017 2:54 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] vm2: Error: Cannot communicate via libvirt to 127.0.0.1 Ah, ok. On your xCAT system as root: cat ~/.ssh/id_rsa.pub This will get root's public key. This is generic ssh setup procedure to establish secure passwordless ssh. You would want to copy that output to clipboard or whatever is most convenient for you. On your desktop: Make sure .ssh exists (it probably does alr
Re: [xcat-user] statefull vs. stateless images
Is it possible to set up a VM on your existing MN? This might allow you to import your current xCAT DB, upgrade to the latest version and try booting a node or two off the VM instance. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Russell Auld [mailto:russa...@comcast.net] Sent: Friday, January 13, 2017 6:34 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] statefull vs. stateless images Upgrading is pretty painless. Unfortunately, there's no way to really know if there have been any code changes that will affect you. I got burned by a couple things when I upgraded from 2.9.x to 2.12.x. 1 - When I added some new nodes, and then ran makedhcp, the new DHCP stanzas were incompatible with my existing leases file. Simply running 'makedhcp -n' fixed the issue. It was never obvious to me that I should have regenerated the leases file after the upgrade. 2 - Some changes to the postbootscripts script resulted in postbootscripts no longer running for certain nodes. The issue was a particular corner-case, and has been fixed in v2.13.x. I think you should be able to revert to v2.9.x if the upgraded version doesn't work for you. You would need to backup your databases since the upgrade process can alter the table schemas. On Fri, 2017-01-13 at 21:10 +, Damir Krstic wrote: > Hi Jarrod, > > Thanks for the prompt answer. I agree with you re. stateless. Next > hardware purchase we will be going statefull. 
> > to that end, we are running following version of xCAT: > > [root@mgt rh]# rpm -qa |grep -i xcat > conserver-xcat-8.1.16-10.x86_64 > xCAT-2.9.1-snap201503190326.x86_64 > xCAT-genesis-base-x86_64-2.9-snap201504212134.noarch > elilo-xcat-3.14-4.noarch > xCAT-server-2.9.1-snap201503190325.noarch > grub2-xcat-1.0-2.noarch > perl-xCAT-2.9.1-snap201503190325.noarch > xCAT-buildkit-2.9.1-snap201503190326.noarch > ipmitool-xcat-1.8.11-3.x86_64 > xCAT-client-2.9.1-snap201503190325.noarch > xCAT-genesis-scripts-x86_64-2.9.1-snap201503190326.noarch > syslinux-xcat-3.86-2.noarch > > I think in order to deploy statefull version of RH7.3 we will need to > update our xCAT. What is the most painless way of upgrading from our > version to the latest stable RH 7 supporting version? Are there any > gotchas or recommended practices when it comes to upgrade of xCAT? > Last time I had to do this, instead of upgrading, I deployed a new > xCAT server which was not too painful but I don't have the notes of > what I had to do to get it going. > > I would much rather just upgrade the xCAT on this server because the > machine itself is not that old (2 years or so now). > > Anything I should back up before attempting upgrade as well? > > Thanks, > Damir > > > On Fri, Jan 13, 2017 at 9:10 AM Jarrod Johnson <jjohns...@lenovo.com> > wrote: > > I think stateless makes a little less sense over time. > > > > 1) Local boot storage is cheaper and more durable than it used > > to be, and this is only going to get more extreme > > 2) Dynamism is probably better and more easily served by > > somethig like Singularity, which makes things easier for users to do > > their thing without the administrators having to accommodate. > > 3) Mitigating drift can be done in other ways. Stateless has > > traditionally had the side effect of mitigating accumulating ‘drift’ > > as people do things ad-hoc to OS images, by punishing those > > practices. 
Strictly speaking the same discipline can be self- > > imposed without downside, it just takes some willpower. > > > > From: Damir Krstic [mailto:damir.krs...@gmail.com] > > Sent: Friday, January 13, 2017 9:20 AM > > To: xCAT Users Mailing list > > Subject: [xcat-user] statefull vs. stateless images > > > > We have been running our cluster using stateless images for over 6 > > years now. For the most part, things are running great. There are > > two reasons for our decision to run stateless: > > 1. our compute nodes originally did not have local hard drives 2. we > > envisioned a dynamic environment in which we would boot nodes > > frequently with different images to satisfy different research needs > > > > Today both of those points are invalid / do not apply. All of our > > compute nodes come with hard drives, and we have never really booted > > cluster with any images other than our "production" image. > > In addition, downtimes are really hard to come by in our > > environment, and we treat our cluster as production system. > > > > So, my question is, does it make sense to continue with stateless > > images, or would we be better served with
Re: [xcat-user] Virtual eth interface on Lenovo X3550
If I am not mistaken, this is the IMM's USB interface to the OS (or vice versa). It is used by the Advanced Settings Utility (ASU) to send commands from the local OS to the IMM. To disable it, you can do it manually in the UEFI interface, you may be able to do it from the IMM web or SSH interface, or you can use ASU. For example, on an nx350 m5, the setting is "IMM.LanOverUsb=Enabled". If you run ASU locally, you can probably find the setting by running 'asu64 show all | grep -i usb'; with xCAT, you can use 'pasu node show all | grep -i usb'. To change it, run: locally: asu64 set IMM.LanOverUsb Disabled xCAT: pasu node set IMM.LanOverUsb Disabled Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Andy Loftus [mailto:alof...@illinois.edu] Sent: Wednesday, September 07, 2016 3:54 PM To: xcat-user@lists.sourceforge.net Subject: [xcat-user] Virtual eth interface on Lenovo X3550 Server make/model: Lenovo X 3550 Xcat version: Version 2.11.1 OS: CentOS Linux release 7.2.1511 (Core) On 34 of 47 nodes, after installing a new OS, there is a virtual ethernet interface that looks like: # ip addr show ... enp0s20u1u5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000 ... There used to be a 169.254.x.x address associated with it, but I think it gets removed by the postscript 'confignics -r', although the interface remains. Anyone know where this comes from and/or how to prevent it? Cheers, --Andy
Re: [xcat-user] Is there an ifcfg-eth postscript that works on systemd OSes?
You will probably want to include "eno" in your grep statement. Not all Ethernet devices come up as "ens". I'm not sure whether there are any other variations. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Josh Nielsen [mailto:jniel...@hudsonalpha.org] Sent: Wednesday, August 03, 2016 5:01 PM To: xCAT Users Mailing list; Rich Sudlow Subject: Re: [xcat-user] Is there an ifcfg-eth postscript that works on systemd OSes? Thanks! Yes, I knew ifconfig was deprecated, which is why I knew this was a hack and was asking. Is that postscript something you wrote yourself? Thanks, Josh On Wed, Aug 3, 2016 at 3:49 PM, Rich Sudlow <r...@nd.edu> wrote: On 08/03/2016 03:54 PM, Josh Nielsen wrote: Hello, I am now testing the deployment of Centos 7 in my environment and I've noticed that the ifcfg-eth postscript is not geared to work with it. For starters, Centos 7 doesn't install ifconfig by default, though I've solved that with my kickstart. More to the point, the postscript explicitly looks for "Ethernet" in the ifconfig line to grab the interface name, which doesn't work on systemd OSes like Centos 7. And of course there's the change from "eth" interface names to "ens" and a variety of other names. I changed the line that looked like this in the postscript: interfaces=$(ifconfig -a | grep "Ethernet" | awk '{print $1}') To this: interfaces=$(ifconfig -a | egrep "Ethernet|ens" | awk '{print $1}') And while that does parse out the ens interface names now, they come with a colon tacked on to the end of them in the ifconfig output, like this: ens160: ens192: I can parse that out with a regex substitution to remove the colon, but before I hack the default script up too much, has there been an alternative ifcfg-eth postscript released for systems like this? I'm using this script to change the /etc/sysconfig/network-scripts/ifcfg-* files from using DHCP to the static addresses defined through xCAT, which works fine on my Centos 6 OSes.
This is my xCAT version: lsxcatd -v Version 2.11 (git commit 9ea36ca6163392bf9ab684830217f017193815be, built Mon Nov 30 05:43:11 EST 2015) Thanks, Josh Nielsen
I believe all the latest xCAT routines use ip addr; as you might know, the use of ifconfig is deprecated. Here's a snippet from a postscript which uses ksh:
    if [[ $OSVER = *rhels7* ]]; then
        # This just hardcodes the entries which are already set
        # Change to grep only on "inet " rather than "inet addr" so that rhels7 works - RKS - 8/21/2014
        for nic in `ifconfig -a|grep -B1 "inet "|awk '{print $1}'|grep -v inet|grep -v -- --|grep -v lo|sed s/:$//`; do
            echo "Setting up hardeths on rhels7" >> /root/post.log
            echo NIC $nic
            echo NIC $nic >> /root/post.log
            IPADDR=`ifconfig $nic |grep "inet "|awk '{print $2}' |awk -F: '{print $1}'`
            echo "IPADDR: $IPADDR" >> /root/post.log
            NETMASK=`ifconfig $nic |grep "inet "|awk '{print $4}' |awk -F: '{print $1}'`
            echo "NETMASK: $NETMASK" >> /root/post.log
            sed -i s/BOOTPROTO=dhcp/BOOTPROTO=none/ /etc/sysconfig/network-scripts/ifcfg-$nic
            sed -i s/ONBOOT=no/ONBOOT=yes/ /etc/sysconfig/network-scripts/ifcfg-$nic
            echo IPADDR=$IPADDR >> /etc/sysconfig/network-scripts/ifcfg-$nic
            echo NETMASK=$NETMASK >> /etc/sysconfig/network-scripts/ifcfg-$nic
            # Remove firewalld since we're using iptables - RKS 9/30/14
            yum remove -y firewalld | logger -t xcat
            echo "Done with hardeths on rhels7" >> /root/post.log
        done
    fi
Hope this helps you out. -- Rich Sudlow University of Notre Dame Center for Research Computing - Union Station 506 W. South St South Bend, In 46601 (574) 631-7258 (office) (574) 807-1046 (cell)
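As a rough sketch of the ip-based approach mentioned in this thread: the function below pulls Ethernet interface names out of `ip -o link show` output, so it does not depend on the "Ethernet" string or on an "ens"/"eno" prefix, and it never sees the trailing colon that ifconfig prints. The name list_ether_ifaces is made up for illustration, and the sketch assumes the one-line-per-interface output format of iproute2. It is not the stock ifcfg-eth postscript.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): read `ip -o link show` output on
# stdin and print the names of interfaces whose link type is Ethernet.
list_ether_ifaces() {
    # Fields are split on ": ", so $2 is the interface name.
    # Keeping only lines that contain "link/ether" drops loopback and
    # InfiniBand; the sub() strips a "@parent" suffix from VLAN names.
    awk -F': ' '/link\/ether/ { name = $2; sub(/@.*/, "", name); print name }'
}

# Intended use on a node (commented out so the sketch has no side effects):
#   interfaces=$(ip -o link show | list_ether_ifaces)
```

Note that virtual devices such as bonds and bridges also report link/ether, so a real postscript would probably still need to filter the result.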
Re: [xcat-user] discovering interface names dynamically
A few things to try: First, you can boot to the genesis image (nodeset node shell), which is the same thing that is running after discovery. From there, you can log in and use ethtool to identify your interfaces. Second, RHEL 7 variants now use what they call consistent naming. If you can wrap your head around the logic behind it, you can probably predict what the interface names will be. I can't, so… Third, if you don't like the consistent naming, you can add "biosdevname=0 net.ifnames=0" to addkcmdline for your osimage, and RHEL 7 will boot up with good old eth(0, 1, 2…). Finally, with xCAT 2.12 came the command "getadapter", which will "Scan the network adapters on the compute nodes to determine the predictable naming of the network interfaces when the OS is installed." I haven't messed with it yet. https://github.com/xcat2/xcat-core/wiki/XCAT_2.12_Release_Notes Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Andrew Loftus [mailto:alof...@illinois.edu] Sent: Thursday, June 30, 2016 1:18 PM To: xcat-user@lists.sourceforge.net Subject: [xcat-user] discovering interface names dynamically Does anyone have any experience or ideas how to acquire (ethernet) interface names dynamically? For instance, in setting up secondary networks, entries in the "nics" table require both the interface name and associated network information. However, these interface names aren't known before the OS is installed (unless one node has already been installed and the interface names checked manually). Furthermore, the interface names change based on the combination of OS version, hardware model, and install type. So it's not very robust. My thought is to set up secondary networks using a postbootscript. I'm looking for ideas to scan each ethernet interface and discover the network information, then set up the secondary networks using that information. Cheers, --Andy
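For the postbootscript idea raised in the question, one possible starting point is to enumerate hardware-backed interfaces from sysfs at run time instead of hard-coding names in the nics table. This is only a sketch: list_physical_nics is a hypothetical helper, and it assumes the usual Linux layout in which physical network devices have a `device` entry under /sys/class/net while purely virtual ones (lo, bonds, bridges) do not.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): print the names of network
# interfaces that are backed by a hardware device.
# $1 is the sysfs network directory; it defaults to /sys/class/net and is
# a parameter only so the function can be exercised against a test tree.
list_physical_nics() {
    netdir=${1:-/sys/class/net}
    for dev in "$netdir"/*; do
        # Skip virtual devices, which have no backing "device" entry.
        [ -e "$dev/device" ] || continue
        basename "$dev"
    done
}

# Intended use on a node (commented out so the sketch has no side effects):
#   for nic in $(list_physical_nics); do ethtool "$nic"; done
```

Each name could then be passed to ethtool, as suggested in the reply, to work out which network the port is attached to.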
Re: [xcat-user] configuring fpc
You can reset the FPC to factory default. It's an ornery process. If I recall correctly, you remove the FPC for 10 minutes. Remove the battery before reinserting the FPC. Let the FPC run without a battery for 10 minutes. Remove the FPC, replace the battery, and reinsert the FPC. It's important to respect the 10-minute guideline. I've seen a case where a customer was short by a minute or so, and the FPC did not reset to factory default. Once that is done, configfpc should be able to find the default IP. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Damir Krstic [mailto:damir.krs...@gmail.com] Sent: Monday, June 13, 2016 10:57 AM To: xCAT Users Mailing list Subject: [xcat-user] configuring fpc We got an empty N1200 from Lenovo some time back in anticipation of new nodes arriving this summer. In previous times, Lenovo would program the FPCs with some internal address as a part of their cluster solution (172.30.101.141 for example). This empty chassis does not have its IP recorded in the paperwork we received from Lenovo on delivery of the rack. I am trying to configure this FPC using the configfpc command, and if I try using the defaults: configfpc -i bond0 I get no default IP found. I see the FPC on the switch port and I see its MAC: qfivebnt08#show mac-address-table interface port 42 MAC address VLAN PortTrnk State Permanent Openflow - --- - - 6c:ae:8b:5e:56:14 142 FWD N I also have the FPC configured for the right switch port in xCAT: [root@mgt ~]# lsdef qfpc24 Object name: qfpc24 bmc=qfpc24 bmcpassword=PASSW0RD bmcusername=USERID cons=ipmi groups=rack-t22fpc,qfpc,all ip=172.30.11.24 mgt=ipmi nodetype=qfpc postbootscripts=otherpkgs,setupntp postscripts=syslog,remoteshell,syncfiles switch=qfivebnt08 switchport=42 I suspect this FPC is configured with some other IP (other than default), but I don't know what that IP is since it's not documented. Is there any way of programming the FPC if I don't have the IP? Thanks, Damir
Re: [xcat-user] Setting up a bond interface using boot NIC's for stateless RHEL
I would script this either in rc.local on the stateless nodes or in a similarly configured systemd script, depending on your OS level. Then, just include the script in the synclist for your osimage. If using rc.local, make sure to sync it to /etc/rc.d/. If you sync it to /etc/rc.local, it will overwrite the symlink to /etc/rc.d and will not work. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Lopilato, John [mailto:john.lopil...@lmco.com] Sent: Wednesday, June 01, 2016 11:47 AM To: xcat-user@lists.sourceforge.net Subject: [xcat-user] Setting up a bond interface using boot NIC's for stateless RHEL I'm trying to set things up so that when my diskless blades boot, I can configure the node to set up a bond interface once the OS has been downloaded. To make things complicated, the bond interface needs to include the NIC that the node booted over. The goal is to have a bond in failover mode between two NICs, and I'd ideally set xCAT up to be able to boot the same node over either interface. I haven't been able to find any good documentation on how to set up a bond interface for diskless/stateless systems. Can anyone point me in the right direction? Thanks, John Lopilato
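The rc.local advice in this thread might look like the following in practice. The sketch appends a one-line entry to a synclist that delivers a bonding script to /etc/rc.d/rc.local, and it refuses the /etc/rc.local destination that would replace the symlink. The helper name add_sync_entry, the source path under /install/custom, and the synclist location are illustrative assumptions; the `source -> destination` line format is the usual xCAT synclist convention.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): append one
# "source -> destination" entry to a synclist file.
add_sync_entry() {
    synclist=$1
    src=$2
    dst=$3
    # Syncing a regular file onto /etc/rc.local would overwrite the symlink
    # that points at /etc/rc.d/rc.local, so reject that destination.
    if [ "$dst" = "/etc/rc.local" ]; then
        echo "refusing $dst: sync to /etc/rc.d/rc.local instead" >&2
        return 1
    fi
    printf '%s -> %s\n' "$src" "$dst" >> "$synclist"
}

# Example (all paths are assumptions; commented out so nothing is written):
#   add_sync_entry /install/custom/netboot/compute.synclist \
#       /install/custom/netboot/rc.local /etc/rc.d/rc.local
#   chdef -t osimage -o <osimage-name> synclists=/install/custom/netboot/compute.synclist
```

The bonding logic itself would live in the synced rc.local script and is not shown here, since it depends on which NIC the node booted over.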
Re: [xcat-user] How can I migrate to a new xCAT MN in a hierarchical environment?
I would begin by looking at the servicenode postscript. It sets up the daemon and database communications between SN & MN. Beyond that, the default postscripts are listed in the "xcatdefaults" entry of the postscripts table. You will probably want to run updatenode -k once you have xCAT configured on the new MN. After that, you probably want to rerun the remoteshell and syslog postscripts on the cluster members (updatenode -P) at the very least. Second, you can dump the xCAT DB using the dumpxCATdb command. After that, grep out the management node (hostname and/or IP) to see where changes need to be made for the DB on the new MN. If the SNs are handling DHCP, it only needs to be enabled on the MN if you plan on reinstalling a SN. Anything that resolves DNS through the MN will need an updated resolv.conf. Depending on how you're maintaining your /install directory on the SNs, that mechanism will need to be updated. If your MN is routing for any nodes, that will need to be addressed. You might want to check the network configuration on the IMMs. On discovery, if you have a gateway defined on your management network (I believe it defaults to ), they might be pointing to the old MN. It shouldn't be an issue, but it's something to think about. If you're not routing on that network, I would use pasu to set the IMM gateway to 0.0.0.0 and be done with it. The only other concern I can think of would be the installation repos configured on the cluster nodes and SNs. If any point to the MN, they will need to be changed. Aside from all of that, it really depends on the particulars of your cluster. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Josh Nielsen [mailto:jniel...@hudsonalpha.org] Sent: Monday, May 02, 2016 8:32 PM To: xCAT Users Mailing list Subject: [xcat-user] How can I migrate to a new xCAT MN in a hierarchical environment?
Hello all, My team is trying to move the xCAT MN role off of an old server and get it over onto new virtual infrastructure, but I am a little unsure about whether it is possible to do while leaving everything else in its place as we currently have it in our environment. We have an MN with two SNs for our xCAT environment, and I would need to make the SNs recognize that the new MN (with a new IP and hostname) is now their xcatmaster, and they would need to take hierarchical command updates from the new MN, look to the new MN for the xCAT database (which is a MySQL database in our environment), etc. So a few questions along those lines. 1. Which/how many xCAT database fields would I need to update that use the MN's IP (other than "master" in the site table), and would I have to reinstall or otherwise update anything on the SNs (I imagine restarting the daemons is necessary at a minimum) in case they have anything statically configured for the current MN's IP? 2. Do any default postscripts for deployed clients ever place the MN's hostname or IP in any config files that would require manual alteration if the MN is changed? Our client nodes should, however, have one of the two SNs as their designated xcatmaster, instead of the MN, as shown by an 'lsdef'. 3. And as far as DHCP, the MN does not even need DHCP running if the SNs are handling DHCP, correct? Would I have to change any of my 'networks' table entries and DHCP IP pool config in any case, or should simply dumping and importing the current DB settings in to the new MN instance be seamless? DNS I think (hope) should be an easier matter, since we already have an external DNS server configured that the MN pushes entries to with a 'makedns -e', so no DNS dependency lies on the present MN itself. I imagine I'd have to copy the /etc/hosts from the current MN over to the new though for the makedns (and other things) to continue working. 
I have attached an image with a simplified sketch of what our xCAT environment looks like. Overall I'm just wondering what changes would I need to make for this to be possible. Thanks for your input. Josh Nielsen
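The "dump the database and grep for the management node" step suggested in the reply could be sketched as follows. find_mn_refs is a hypothetical helper, and the dump directory, hostname, and IP in the usage comment are placeholders. The sketch assumes that dumpxCATdb -p writes its table dumps as files into the given directory, which is my understanding of the command rather than something stated in the thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT): list the files in a database dump
# directory that mention any of the given management node names or IPs.
# Usage: find_mn_refs <dumpdir> <pattern> [<pattern> ...]
find_mn_refs() {
    dumpdir=$1
    shift
    for pattern in "$@"; do
        # -r: search the dump directory recursively
        # -l: print matching file names only
        # -w: whole-word match, so 10.0.0.1 does not match 10.0.0.10
        # -F: literal match, so the dots in an IP are not wildcards
        grep -rlwF -- "$pattern" "$dumpdir" || true
    done | sort -u
}

# Intended use on the old MN (placeholders; commented out):
#   dumpxCATdb -p /tmp/xcatdb-dump
#   find_mn_refs /tmp/xcatdb-dump oldmn.example.com 10.0.0.1
```

The resulting file list only shows where the old MN is referenced; deciding which entries to change on the new MN is still a manual review.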
Re: [xcat-user] reventlog : No Mappings found
I meant physically taking the memory out and putting it back in. Sometimes a bad DIMM connection can cause non-specific memory errors. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Jean-Baptiste Denis [mailto:jbde...@pasteur.fr] Sent: Monday, March 14, 2016 5:35 PM To: xcat-user@lists.sourceforge.net Subject: Re: [xcat-user] reventlog : No Mappings found On 03/14/2016 04:55 PM, Christian Caruthers wrote: > Have you tried reseating the memory in the system? What do you mean exactly? reventlog node clear?
Re: [xcat-user] reventlog : No Mappings found
Have you tried reseating the memory in the system? Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Jean-Baptiste Denis [mailto:jbde...@pasteur.fr] Sent: Monday, March 14, 2016 11:48 AM To: xcat-user@lists.sourceforge.net Subject: [xcat-user] reventlog : No Mappings found Hello, I've got these kinds of entries for a node in reventlog: $ reventlog tars-407 tars-407: 03/11/2016 19:44:54 No Mappings found (CCh), No Mappings found (00h) (Sensor 0xff) tars-407: 03/11/2016 20:53:31 No Mappings found (CCh), No Mappings found (00h) (Sensor 0xff) If I go to the IPMI admin web interface, I can see the corresponding entries: 1 2016/03/11 19:44:50 Memory Error BIOS OEM Failing DIMM: DIMM location (Correctable memory component found) (P2-DIMME1) 2 2016/03/11 20:53:27 Memory Error BIOS OEM Failing DIMM: DIMM location (Correctable memory component found) (P2-DIMME1) Where is the culprit? =) I guess the IPMI implementation could be non-compliant. What do you think? How should we manage this? Thank you! Jean-Baptiste
Re: [xcat-user] 回复: 回复: Other repos in stateless images
My goal was to use a repository accessible to the MN (EPEL) without the hassle of downloading the packages I need from EPEL along with their dependencies and maintaining a local repo for use with otherpkgs. That seems like more of a hassle than simply being able to point to the EPEL repo in a config file. For example, with a stateful install, I can include additional repos that may not be on the MN, but are accessible to the node I'm installing. I'm looking for a way to do this same thing with a stateless install. I see where the yum command in genimage calls to /tmp/genimage.PID.conf. This file is merely a yum repo config. How would I go about getting the EPEL info there? Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Victor Hu [mailto:v...@us.ibm.com] Sent: Friday, February 05, 2016 7:19 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] 回复: 回复: Other repos in stateless images Christian, Just to follow up ... The idea is that the pkglist will handle packages provided on the base OS and otherpkglist will handle things that are manually added. Keeping those separate will make it easier to manage. If something goes wrong with the files in /install/ you can delete the directory and re-run copycds. Thanks, Victor From: "Xiao Peng Wang" <w...@cn.ibm.com> To: "xCAT Users Mailing list" <xcat-user@lists.sourceforge.net> Cc: "xCAT Users Mailing list" <xcat-user@lists.sourceforge.net> Date: 02/04/2016 11:33 PM Subject: [xcat-user] 回复: 回复: Other repos in stateless images It's not replication; any packages downloaded from outside should use an additional repo. That's easy to manage. Sent from my iPhone using IBM Verse. On February 5, 2016, at 11:52:00, "Christian Caruthers" <ccaruth...@lenovo.com> wrote: I have that working. Is it possible to just use the existing repo rather than replicating it on the management node?
Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Xiao Peng Wang [mailto:w...@cn.ibm.com] Sent: Thursday, February 04, 2016 10:41 PM To: xCAT Users Mailing list Cc: xCAT Users Mailing list (xcat-user Subject: [xcat-user] 回复:Other repos in stateless images If you want to use EPEL locally, you should download the packages to a local repo and use otherpkgs to install them. Sent from my iPhone using IBM Verse. On February 5, 2016, at 10:30:34, "Christian Caruthers" <ccaruth...@lenovo.com> wrote: I have a MN running 2.11 with the latest deps. I'm trying to add the EPEL repo to a stateless image. The MN is configured to see EPEL, but it's not local - I haven't run copycds on anything to get the EPEL repo. Is there another way to call to it? When I include a package from the EPEL repo in the profile.pkglist it is skipped. The only way I can get it to install is via otherpkgs. Christian Caruthers Enterprise IT Consultant xESS Lenovo US 757-289-9872
Re: [xcat-user] Reply: Unable to boot nodes on new xCAT system
XNBA stands for xCAT Network Boot Agent and is basically an enhanced gPXE. It has become the default for use with System x machines. From Google: The xCAT Network Boot Agent is a slightly modified version of gPXE. It provides enhanced boot features for any UNDI compliant x86 host. This includes iSCSI, http/ftp downloads, and gPXE script based booting. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Jeff White [mailto:jeff.wh...@wsu.edu] Sent: Friday, February 05, 2016 4:32 PM To: xcat-user@lists.sourceforge.net Subject: Re: [xcat-user] Reply: Unable to boot nodes on new xCAT system Casandra, as I stated I don't know what XNBA is. Therefore, I doubt I'm using it. If you need more information please let me know how to get that information from xcat. Jeff White HPC Systems Engineer Information Technology Services - WSU On 02/05/2016 12:41 PM, Casandra H Qiu wrote: Just for clarification, did you use xnba boot and hit that "no space left on device" issue? Thanks, Casandra ... Casandra Hong Qiu Phone: (845) 433-9291, t/l 293-9291 Office: B/002, Floor 3, Z13 cxh...@us.ibm.com From: Jeff White <jeff.wh...@wsu.edu> To: xcat-user@lists.sourceforge.net Date: 02/05/2016 11:05 AM Subject: Re: [xcat-user] Reply: Unable to boot nodes on new xCAT system Mostly because I don't know what it is and PXE works for me. In any case I'm past that issue. However now the node begins booting but dies with various "no space left on device" errors on /sysroot/var. Any clues? Jeff White HPC Systems Engineer Information Technology Services - WSU On 02/04/2016 07:43 PM, Xiao Peng Wang wrote: Why not use xnba instead of pxe? 
Sent from my iPhone using IBM Verse On February 5, 2016, at 07:05:22, "Jeff White" <jeff.wh...@wsu.edu> wrote: I have a working xCAT box on CentOS 7 which I am moving to another CentOS 7 box. I did a clean install of xCAT then migrated node images and such. I'm adding nodes manually and the first one won't boot. It fails with: Trying to load: pxelinux.cfg/0A6E04 Could not find kernel image: xcat/nbk.x86_64 That hex is the first 3 octets of the IP, why wouldn't it be the entire IP? I can see xcat configured tftp whatevers like so: # ls -l /tftpboot/pxelinux.cfg/ total 24 -rw-r--r-- 1 root root 118 Jan 27 11:39 0A6E04 lrwxrwxrwx 1 root root 3 Feb 4 14:06 0A6E0414 -> cn0 -rw-r--r-- 1 root root 118 Jan 27 11:39 0A6E05 -rw-r--r-- 1 root root 118 Jan 27 11:39 0A6E06 -rw-r--r-- 1 root root 118 Jan 27 11:39 0A6E07 -rw-r--r-- 1 root root 115 Jan 27 11:39 7F -rw-r--r-- 1 root root 380 Feb 4 14:06 cn0 The node is "cn0". So, how do I get the node to look for 0A6E0414 instead of 0A6E04? ... or how can I make xcat work with 0A6E04? -- Jeff White HPC Systems Engineer Information Technology Services - WSU
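Some background on the 0A6E04 versus 0A6E0414 question in this thread: pxelinux looks for its configuration under pxelinux.cfg/ by client UUID and MAC address first, then by the client's IPv4 address written as eight uppercase hex digits, dropping one trailing digit per retry until it falls back to "default". The Python sketch below reproduces only the IP-based part of that search order. The function name is made up for illustration, and 10.110.4.20 is simply the address that 0A6E0414 decodes to, not an address confirmed anywhere in the thread.

```python
import ipaddress


def pxelinux_config_candidates(ip):
    """Return the IP-based file names pxelinux tries under pxelinux.cfg/, in order."""
    # The IPv4 address is rendered as eight uppercase hexadecimal digits.
    hex_ip = f"{int(ipaddress.IPv4Address(ip)):08X}"
    # Full name first, then one trailing hex digit is removed per attempt.
    names = [hex_ip[:n] for n in range(len(hex_ip), 0, -1)]
    # The last resort is the file literally named "default".
    names.append("default")
    return names


if __name__ == "__main__":
    for name in pxelinux_config_candidates("10.110.4.20"):
        print(name)
```

For 10.110.4.20 the list starts with 0A6E0414, 0A6E041, 0A6E04 and ends with default. A load of pxelinux.cfg/0A6E04 therefore suggests that nothing matched the longer names for whatever address the node actually received, so it may be worth comparing the node's DHCP lease with the address xCAT has on record for it.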
[xcat-user] Other repos in stateless images
I have a MN running 2.11 with latest deps. Trying to add EPEL repo to a stateless image. The MN is configured to see EPEL, but it's not local - I haven't run copycds on anything to get the EPEL repo. Is there another way to call to it? When I include a package from the EPEL repo in the profile.pkglist it is skipped. The only way I can get it to install is via otherpkgs. Christian Caruthers Enterprise IT Consultant xESS Lenovo US 757-289-9872 ccaruth...@lenovo.com
Re: [xcat-user] Reply: Other repos in stateless images
I have that working. Is it possible to just use the existing repo rather than replicating it on the management node? Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Xiao Peng Wang [mailto:w...@cn.ibm.com] Sent: Thursday, February 04, 2016 10:41 PM To: xCAT Users Mailing list Cc: xCAT Users Mailing list (xcat-user Subject: [xcat-user] Reply: Other repos in stateless images If you want to use EPEL locally, you should download the packages to a local directory and use otherpkgs to install them. Sent from my iPhone using IBM Verse On February 5, 2016, at 10:30:34, "Christian Caruthers" <ccaruth...@lenovo.com> wrote: I have a MN running 2.11 with latest deps. Trying to add EPEL repo to a stateless image. The MN is configured to see EPEL, but it's not local - I haven't run copycds on anything to get the EPEL repo. 
Is there another way to call to it? When I include a package from the EPEL repo in the profile.pkglist it is skipped. The only way I can get it to install is via otherpkgs. Christian Caruthers Enterprise IT Consultant xESS Lenovo US 757-289-9872 ccaruth...@lenovo.com
Re: [xcat-user] The logo of xCAT
Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Jarrod Johnson [mailto:jjohns...@lenovo.com] Sent: Thursday, October 15, 2015 1:14 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] The logo of xCAT xCAT I'm glad we saved room in unicode for so many variants on cats... Wondering why U+1F408 doesn't seem to work anywhere... -Original Message- From: John van Ommen [mailto:john.vanom...@gmail.com] Sent: Thursday, October 15, 2015 1:03 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] The logo of xCAT Here's my contribution: 【=◈︿◈=】 X C A T It's an ascii cat logo that I stole off of the cover of 'Worlds' On Thu, Oct 15, 2015 at 11:21 AM, Christian Caruthers <ccaruth...@lenovo.com> wrote: > We've gone off the rails! > > Regards, > Christian Caruthers > Lenovo xESS IT Consultant > Mobile: 757-289-9872 > > > -Original Message- > From: Rich Sudlow [mailto:r...@nd.edu] > Sent: Thursday, October 15, 2015 9:20 AM > To: xCAT Users Mailing list > Subject: Re: [xcat-user] The logo of xCAT > > On 10/15/2015 04:26 AM, Christopher Brown wrote: >> Well, in an ideal world you would use the Hello Kitty logo because it >> turns out that although everyone thinks its a kitten it's actually a >> drawing of a British public school girl [1] therefore making Hello >> Kitty ... an ex-cat! >> >> :) >> >> 1. >> http://www.bbc.co.uk/newsbeat/article/28963085/hello-kitty-is-not-a-c >> a >> t---shes-a-british-school-kid > > ;-) I love it!!! 
Re: [xcat-user] The logo of xCAT
We've gone off the rails! Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Rich Sudlow [mailto:r...@nd.edu] Sent: Thursday, October 15, 2015 9:20 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] The logo of xCAT On 10/15/2015 04:26 AM, Christopher Brown wrote: > Well, in an ideal world you would use the Hello Kitty logo because it > turns out that although everyone thinks its a kitten it's actually a > drawing of a British public school girl [1] therefore making Hello > Kitty ... an ex-cat! > > :) > > 1. > http://www.bbc.co.uk/newsbeat/article/28963085/hello-kitty-is-not-a-ca > t---shes-a-british-school-kid ;-) I love it!!! > > On Wed, 2015-10-14 at 16:48 +0100, Jarrod Johnson wrote: >> Wow, people have thought about this a lot more than I ever did >> >> -Original Message- >> From: Thomas Orgis [mailto:thomas.or...@uni-hamburg.de] >> Sent: Wednesday, October 14, 2015 11:30 AM >> To: xcat-user@lists.sourceforge.net >> Subject: Re: [xcat-user] The logo of xCAT >> >> Am Wed, 14 Oct 2015 14:24:20 + >> schrieb Mark Loveridge <ma...@slb.com>: >> >>> Not to mention the doens of files with xcat in the name in the product. >>> Whilst I may rarely if ever see the logo I see the name every day. >> >> Also, it's the x as in Unix (trademark or not), plus the command cat. I >> understood xCAT to mean `cat < $UNIX_LIKE_SYSTEM > $CLUSTER`. >> >> >> Alrighty then, >> >> Thomas >> >> -- >> Dr. Thomas Orgis >> Universität Hamburg >> RRZ / Zentrale Dienste / HPC >> Schlüterstr. 
70 >> 20146 Hamburg >> Tel.: 040/42838 8826 >> Fax: 040/428 38 6270 > > -- > Regards, > > Christopher Brown > Openstack Engineer > OCF plc > > Tel: +44 (0)114 257 2200 > Web: www.ocf.co.uk > Blog: blog.ocf.co.uk > Twitter: @ocfplc -- Rich Sudlow University of Notre Dame Center for Research Computing - Union Station 506 W. South St South Bend, In 46601 (574) 631-7258 (office) (574) 807-1046 (cell)
Re: [xcat-user] The logo of xCAT
Adding on (mohawk optional): ^_|||_^ 【=◈︿◈=】 X C A T Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: John van Ommen [mailto:john.vanom...@gmail.com] Sent: Thursday, October 15, 2015 1:03 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] The logo of xCAT Here's my contribution: 【=◈︿◈=】 X C A T It's an ascii cat logo that I stole off of the cover of 'Worlds' On Thu, Oct 15, 2015 at 11:21 AM, Christian Caruthers <ccaruth...@lenovo.com> wrote: > We've gone off the rails! > > Regards, > Christian Caruthers > Lenovo xESS IT Consultant > Mobile: 757-289-9872 > > > -Original Message- > From: Rich Sudlow [mailto:r...@nd.edu] > Sent: Thursday, October 15, 2015 9:20 AM > To: xCAT Users Mailing list > Subject: Re: [xcat-user] The logo of xCAT > > On 10/15/2015 04:26 AM, Christopher Brown wrote: >> Well, in an ideal world you would use the Hello Kitty logo because it >> turns out that although everyone thinks its a kitten it's actually a >> drawing of a British public school girl [1] therefore making Hello >> Kitty ... an ex-cat! >> >> :) >> >> 1. >> http://www.bbc.co.uk/newsbeat/article/28963085/hello-kitty-is-not-a-c >> a >> t---shes-a-british-school-kid > > ;-) I love it!!! > >> >> On Wed, 2015-10-14 at 16:48 +0100, Jarrod Johnson wrote: >>> Wow, people have thought about this a lot more than I ever did >>> >>> -Original Message- >>> From: Thomas Orgis [mailto:thomas.or...@uni-hamburg.de] >>> Sent: Wednesday, October 14, 2015 11:30 AM >>> To: xcat-user@lists.sourceforge.net >>> Subject: Re: [xcat-user] The logo of xCAT >>> >>> Am Wed, 14 Oct 2015 14:24:20 + >>> schrieb Mark Loveridge <ma...@slb.com>: >>> >>>> Not to mention the doens of files with xcat in the name in the product. >>>> Whilst I may rarely if ever see the logo I see the name every day. >>> >>> Also, it's the x as in Unix (trademark or not), plus the command cat. I >>> understood xCAT to mean `cat < $UNIX_LIKE_SYSTEM > $CLUSTER`. 
Re: [xcat-user] The logo of xCAT
Perhaps we could discuss a different name? Seems like we're contorting ourselves to include the 'x' when it's really unnecessary. Maybe instead of "extreme", which is kind of dated, we could use a cool buzz word like "scalable". Scalable Cluster/Cloud Administration Toolkit How's that for an acronym? ;-) Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Douglas Myers [mailto:dgmy...@us.ibm.com] Sent: Tuesday, October 13, 2015 11:41 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] The logo of xCAT I'm rather ambivalent about the old logo, I hazily recall wondering if it wasn't too, er, extreme, when it was first seen, but got over that quickly enough, and for the last 5-6 years haven't thought much about it, either. That said, it's miles ahead of the proposed logo, IMNSHO. That one I actively dislike. To echo Jarrod though, I probably would quit seeing it soon enough. I do like the idea of formalizing this, getting more submissions and setting up a vote. Douglas Myers, IBM Special Events - Smart Cloud BMS Lead "It's not an opportunity if it doesn't scare you a little bit" From: Arif Ali <a...@ocf.co.uk> To: <xcat-user@lists.sourceforge.net> Date: 10/13/2015 08:09 AM Subject: Re: [xcat-user] The logo of xCAT Thomas, You'll see it here ... https://github.com/xcat2/xcat-core/blob/master/xCAT-server/share/xcat/netboot/rh/genimage#L1062 I agree with Rich and Christian, I like the old logo, but it needs a bit of work. 
Unfortunately for me, the new one doesn't seem to catch me On 13/10/15 15:42, Thomas Orgis wrote: On Tue, 13 Oct 2015 14:27:32 +, Christian Caruthers <ccaruth...@lenovo.com> wrote: I believe the old ascii cat still pops up when running genimage or booting a stateless system. Not seeing it during boot (using replaycons) on our stateless systems. Alrighty then, Thomas
Re: [xcat-user] The logo of xCAT
I second what Rich said. While the cat itself is cartoonish, the name looks alright. What's the urgency about changing the logo? If anything, possibly look at cleaning up the existing logo a bit. Another idea might be to use the old ascii art cat. Maybe capture an image of that, convert it to vector and resize it. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Rich Sudlow [mailto:r...@nd.edu] Sent: Tuesday, October 13, 2015 9:52 AM To: xCAT Users Mailing list Subject: Re: [xcat-user] The logo of xCAT On 10/13/2015 09:41 AM, Xiao Peng Wang wrote: > Several weeks ago we had some discussion to make a new logo for xCAT. > Now we got a new one in this page (http://xcat.org/index2.html), you > can compare it with the current one here (http://xcat.org/) OMG - You have got to be kidding... you have people in our office roaring with laughter at the absurdity of this!! > > Someone thought the new one is too cute. What's your opinion? Keep the old one!! > > I remember someone preferred the current logo, so maybe we can just > refine it to make it more modern? There was more than one who preferred the current logo - I was one of them... What's wrong with the current one - just call it retro ;-) Rich - AKA retro Rich ;-) -- Rich Sudlow University of Notre Dame Center for Research Computing - Union Station 506 W. 
South St South Bend, In 46601 (574) 631-7258 (office) (574) 807-1046 (cell) > > > Thanks > Best Regards > -- > Wang Xiaopeng (王晓朋) > IBM China System Technology Laboratory > Tel: 86-10-82453455 > Email: w...@cn.ibm.com > Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, > Haidian District Beijing P.R.China 100193
Re: [xcat-user] KickStart for Centos 7
Can't remember if there's a reference file on the DVD (or in the ISO). I think there may be, but I don't know where. You could install a VM and grab the anaconda-ks.cfg file from root's home directory. Also, there are kickstart templates to start from in /opt/xcat/share/xcat/install/rh/ once you have xCAT installed. Finally, there's the RH documentation: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Installation_Guide/sect-kickstart-syntax.html Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 -Original Message- From: Langton [mailto:langt...@eclipseholdings.co.za] Sent: Monday, September 28, 2015 11:23 AM To: xCAT Users Mailing list Subject: [xcat-user] KickStart for Centos 7 What is the best way to get a kickstart file for CentOS 7? Would like to believe that it is different to CentOS 6. Regards Langton
Re: [xcat-user] Is there a better way to set the hostname with netboot?
In the past, I've used a postscript to reset the hostname based on IP, xCAT node name, or whatever else fits. On RH, edit the HOSTNAME line in /etc/sysconfig/network; SLES uses /etc/HOSTNAME. It looks like CoreOS uses the hostnamectl command. Regards, Christian Caruthers Lenovo xESS IT Consultant Mobile: 757-289-9872 From: Devon Peters [mailto:devon.pet...@jivesoftware.com] Sent: Wednesday, August 19, 2015 4:42 PM To: xCAT Users Mailing list Subject: [xcat-user] Is there a better way to set the hostname with netboot? Background - we've got location-specific xcat node names for our servers, which are used with discovery. When we install an OS on the nodes, the hostname we configure in hosts.hostnames is set as the final hostname, and all is well. The issue we're seeing though, is when we netboot our diskless CoreOS systems the hostname is set to the xcat node name, rather than the first hosts.hostnames like we want. I've found that I can work around this by using: makedhcp node022 -s 'supersede host-name = \therealhostname\;' Though, if someone runs 'makedhcp -a', these customizations get nuked... I'm curious if there's a better or recommended way to set the hostname for these sort of systems? -devon
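The postscript approach from the reply (rewriting the HOSTNAME line in /etc/sysconfig/network on RH-style systems) might be sketched in Python as below. This is only an illustration under assumptions: the function name and the example hostnames are placeholders, and how the desired name gets looked up (IP, xCAT node name, hosts.hostnames) is deliberately left out. SLES or CoreOS would need /etc/HOSTNAME or hostnamectl instead.

```python
import re


def set_sysconfig_hostname(network_text, hostname):
    """Return /etc/sysconfig/network content with HOSTNAME set to hostname."""
    new_line = f"HOSTNAME={hostname}"
    lines = network_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Replace any existing HOSTNAME assignment in place.
        if re.match(r"\s*HOSTNAME=", line):
            lines[i] = new_line
            replaced = True
    if not replaced:
        # No HOSTNAME line yet, so add one at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

A postscript built on this would read the file, call the function with the name it wants, write the result back, and then apply the name to the running system; keeping the text transformation separate makes it easy to check without touching a real node.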
Re: [xcat-user] Preserving image configurations.
Seems like a documentation issue rather than a need for a new feature. I've always done this with syncfiles since it's easier to propagate user and group changes using updatenode. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 From: Jarrod Johnson [mailto:jjohns...@lenovo.com] Sent: Monday, April 27, 2015 3:59 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] Preserving image configurations. This has come up before. I wonder what people on the list think of adding a '-leavepassword' argument to packimage... Jarrod Johnson HPC Systems Management Architect Lenovo From: Ling Gao [mailto:ling...@us.ibm.com] Sent: Monday, April 27, 2015 3:07 PM To: xCAT Users Mailing list Subject: Re: [xcat-user] Preserving image configurations. Hi, You can use either of the following ways. 1. Put it in the image using the postinstall script. When you look at the image def, there is a postinstall script. For example: # lsdef -t osimage rhels7.1-ppc64-netboot-compute Object name: rhels7.1-ppc64-netboot-compute exlist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.ppc64.exlist imagetype=linux osarch=ppc64 osdistroname=rhels7.1-ppc64 osname=Linux osvers=rhels7.1 otherpkgdir=/install/post/otherpkgs/rhels7.1/ppc64 permission=755 pkgdir=/install/rhels7.1/ppc64 pkglist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.ppc64.pkglist postinstall=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.ppc64.postinstall profile=compute provmethod=netboot rootimgdir=/install/netboot/rhels7.1/ppc64/compute You can copy the /etc/passwd and /etc/shadow files into the image in this script: postinstall=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.ppc64.postinstall. The postinstall script is run at the end of genimage. 2. Sync the /etc/passwd and /etc/shadow files down to the node after the deployment using the syncfiles postscript. 
http://sourceforge.net/p/xcat/wiki/Sync-ing_Config_Files_to_Nodes/ Ling Ling Gao Poughkeepsie Unix Development Lab IBM Systems and Technology Group Internal: T/L 293-5692 External: ling...@us.ibm.com, 845-433-5692 I never worry about the future. It comes soon enough. --- Albert Einstein From: Zentz, Scott C. ze...@email.unc.edu To: xcat-user@lists.sourceforge.net Date: 04/27/2015 12:36 PM Subject: [xcat-user] Preserving image configurations. Hello List! We have a server that we would like for all the existing configurations to be placed in a stateless image then deploy out to our servers. Right now genimage seems to recreate /etc/passwd, /etc/shadow and various other files that we would like to preserve. Is that something that we should automate with postscripts? Or is it possible to keep that intact without postscripts? Thanks! -scz
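Ling's first option (placing the account files into the image from the postinstall script) could look roughly like the Python sketch below. It is only an illustration: the helper name and the idea of keeping master copies in a separate directory are hypothetical, and the thread does not say how genimage tells the postinstall script where the image root is, so the caller is assumed to know that path already.

```python
import shutil
from pathlib import Path


def copy_account_files(source_etc, rootimg_dir, names=("passwd", "shadow")):
    """Copy account files from source_etc into <rootimg_dir>/etc.

    Returns the list of file names that were actually copied.
    """
    target_etc = Path(rootimg_dir) / "etc"
    target_etc.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        src = Path(source_etc) / name
        if src.is_file():
            # copy2 keeps permission bits, which matters for shadow.
            shutil.copy2(src, target_etc / name)
            copied.append(name)
    return copied
```

A postinstall script would call this once with the directory holding the files to preserve and the image root, for instance the rootimgdir shown in the lsdef output with rootimg appended, if that is how the image is laid out on your management node. The syncfiles route in option 2 avoids rebuilding the image when accounts change, which is the reason given in the reply for preferring it.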
[xcat-user] Wiki Edit
Adding a module to the genesis kernel, as outlined here: https://sourceforge.net/p/xcat/wiki/XCAT_iDataPlex_Advanced_Setup/#adding-drivers-to-the-genesis-boot-kernel can result in the following error: module signed with unknown key Could a note be added to the wiki to use objcopy -R .note.module.sig module.ko new_module.ko as a means to bypass this error? Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872
Re: [xcat-user] problem with bmc programming
Damir, I can think of 3 troubleshooting routes: 1. Load factory defaults, boot the system to the genesis kernel (nodeset NODE shell) and run bmcsetup. 2. Pull power from the box and plug it back in to reboot the IMM. 3. Create a Bootable Media Creator thumb drive and force it to flash the IMM. If none of that works, you might need to open a service call to replace the system board. Pull a DSA because they'll probably ask for it. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 From: Damir Krstic [mailto:damir.krs...@gmail.com] Sent: Wednesday, April 15, 2015 2:22 PM To: xCAT Users Mailing list Subject: [xcat-user] problem with bmc programming one of our new nodes was just provisioned and I am having an issue programming the bmc. we are using the dedicated imm port on this 3650m4 server. the imm port is plugged in to a switch with a single vlan. the imm interface is configured with the following settings: IP Address Source : Static Address IP Address : 172.29.9.1 Subnet Mask : 255.255.0.0 MAC Address : 40:f2:e9:cd:bf:df SNMP Community String : public Here is the picture of the actual imm settings in the uefi. i can't ping/telnet this interface at all. tcpdump basically shows me that the management node is asking who has the mac address of this node. i have logged in to the switch itself and this mac is not showing in the mac table on the switch. other interfaces (non imm) that are configured on this server and plugged in to the same switch function properly and are accessible with ssh/telnet etc. any help is appreciated. 
damir
Re: [xcat-user] install ib driver
The Mellanox IB driver/OFED stack is also available as a tarball from the Mellanox download site: http://www.mellanox.com/page/products_dyn?product_family=26 Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 From: grihpc [mailto:gri...@126.com] Sent: Monday, March 30, 2015 9:54 AM To: xcat-user@lists.sourceforge.net Subject: [xcat-user] install ib driver The Mellanox IB driver is an ISO file. How do I install it with the OS?
Re: [xcat-user] managing names in a multiple network environment.
I don't know of a way to make xCAT work the way you're asking, but to get around this in the past I have gone 1 of 2 routes: 1. Manually define all external hostnames such that external routable hostnames do not have a -iface (e.g. -eth1) suffix. Then I set the system hostname in postscripting by editing the /etc/sysconfig/network file (/etc/HOSTNAME on SUSE, IIRC) before the reboot. This would leave me with something like login01.cluster.net on the internal net and hpclogin01.customer.com on the routable customer net. 2. Define all hosts in the hosts table and use the hosts.hostnames field to define aliases. In this instance, my internal hostname is the same as above, but my external hostname would be login01-eth0.customer.com with an alias (defined on the xCAT MN) of hpclogin01.customer.com. Again, I would set the system hostname in postscripting the same as above. As for setting internal hostnames, leaving off the suffix makes administrative tasks on larger clusters somewhat easier (e.g. nodeset node1-node4000 shell), but there's nothing saying you can't have your nodes named the way you want them while maintaining the use of node ranges: nodeadd node[001-100]-e0 groups=... nodeset node[001-100]-e0 In some instances, I've even created internal node names that were cumbersome and made each node a member of its own group: nodeadd plcompts1a groups=node001,compute... The only limitation to that approach that I've seen is you can't use 'rcons node001' since 'node001' is a group and rcons will only work on a single node. 'wcons' works fine with this config. Hope that helps. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 -Original Message- From: Allison Andrews [mailto:akandr...@lbl.gov] Sent: Friday, March 06, 2015 6:34 PM To: xCAT-user@lists.sourceforge.net Subject: [xcat-user] managing names in a multiple network environment. Hi, I was wondering if someone could answer one of my xCAT questions. 
In xCAT, you give a node a name, and xCAT by default associates that name with the IP address of the interface the node installs over. In our world, this will be a private management interface on an unroutable network (10.0.0.0/24 for example), which should be named nodename-e0. When generating /etc/hosts etc., xCAT will assign the base hostname to that IP address (e.g. nodename would be 10.0.0.32, rather than the public routable interface 128.xx.yy.zz) and allow you to name other interfaces by adding either a prefix or a suffix to the name on a per-interface basis. This scheme would result in a hosts table like: 10.0.0.32 nodename nodename.domainname # install interface. 128.xx.yy.zz nodename-pub nodename-pub.domainname # additional routable interface I'd like the routable interface to get the unadorned name, and for the management net to get an -e0 suffix. This would result in a hosts table like: 10.0.0.32 nodename-e0 nodename-e0.domainname # install interface. 128.xx.yy.zz nodename nodename.domainname # additional routable interface I'd expect this would be a problem for anyone wanting to install over a non-routable management network (you'd presumably want users coming in over the routable interface to be able to use the unadorned name). How do folks deal with this? Is this an unusual configuration? Do people just live with adorned names like nodename-pub for their outward-facing network interfaces? -Allie
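To make the requested layout concrete, the two /etc/hosts entries from the question can be written out by a small shell function. This is just an illustration of the target format, not an xCAT feature: the function name hosts_entries and its argument order are invented here, and the -e0 suffix is the one used in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print the /etc/hosts layout asked for in the thread,
# with the management (install) interface carrying the -e0 suffix and the
# routable interface carrying the unadorned node name.
hosts_entries() {
    node=$1      # base node name, e.g. nodename
    mgmt_ip=$2   # address on the unroutable management network
    pub_ip=$3    # address on the routable network
    domain=$4    # DNS domain
    printf '%s %s-e0 %s-e0.%s\n' "$mgmt_ip" "$node" "$node" "$domain"
    printf '%s %s %s.%s\n' "$pub_ip" "$node" "$node" "$domain"
}
```

For example, hosts_entries nodename 10.0.0.32 128.xx.yy.zz domainname prints the second hosts table shown in the original question.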
Re: [xcat-user] switches definition not working
Are you saying to enable SNMP on the switch? If so, these switches use the default SNMP setup - v1 PUBLIC community is RO. If I disable the line that defines protocol and username/password in the switches table, discovery works fine. Also, wouldn't xdsh use telnet since I configured the protocol to be telnet in the switches table? When I remove the tn: in front of the username or reconfigure the switches table to use SSH, xdsh stops working on the switches. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn From: Xiao Peng Wang w...@cn.ibm.com To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 2014-09-16 03:13 Subject:Re: [xcat-user] switches definition not working xdsh against the switch is working with ssh connection and run commands. But discovery basically depends on the SNMP interface of the switch, so you must enable the snmp setup. Thanks Best Regards -- Wang Xiaopeng (王晓朋) IBM China System Technology Laboratory Tel: 86-10-82453455 Email: w...@cn.ibm.com Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District Beijing P.R.China 100193 Christian Caruthers ---2014/09/13 22:52:53---I've configured the switches tables for management as so: switch,,,tn:admin,admin,telnet From: Christian Caruthers christian.caruth...@us.ibm.com To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 2014/09/13 22:52 Subject: [xcat-user] switches definition not working I've configured the switches tables for management as so: switch,,,tn:admin,admin,telnet,BNT,, This works when I run xdsh g8052-sw1 --devicetype EthSwitch enable;show run, but when I try to discover a node in the cluster, I see the following in the syslog on the MN: Sep 13 10:30:15 m1 xCAT[24352]: Error communicating with g8052-sw1: Unknown user name I get the same error whether I have the sshusername defined or not. 
Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn
Re: [xcat-user] switches definition not working
Sorry, I forgot to include that: xcat-core-2.8.4 xcat-dep-201407170136 Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn From: Lissa Valletta/Poughkeepsie/IBM@IBMUS To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 2014-09-15 07:28 Subject: Re: [xcat-user] switches definition not working What level of xCAT are you running? Lissa K. Valletta 8-3/B10 Poughkeepsie, NY 12601 (tie 293) 433-3102 From: Christian Caruthers/Richmond/IBM@IBMUS To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 09/13/2014 10:58 AM Subject: [xcat-user] switches definition not working I've configured the switches tables for management as so: switch,,,tn:admin,admin,telnet,BNT,, This works when I run xdsh g8052-sw1 --devicetype EthSwitch enable;show run, but when I try to discover a node in the cluster, I see the following in the syslog on the MN: Sep 13 10:30:15 m1 xCAT[24352]: Error communicating with g8052-sw1: Unknown user name I get the same error whether I have the sshusername defined or not. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn
[xcat-user] switches definition not working
I've configured the switches tables for management as so: switch,,,tn:admin,admin,telnet,BNT,, This works when I run xdsh g8052-sw1 --devicetype EthSwitch enable;show run, but when I try to discover a node in the cluster, I see the following in the syslog on the MN: Sep 13 10:30:15 m1 xCAT[24352]: Error communicating with g8052-sw1: Unknown user name I get the same error whether I have the sshusername defined or not. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn
[xcat-user] genesis qlcnic module
We have an nx360m4 MN running RHEL 6.5. The MN has Mellanox 10Gb adaptors while the compute nodes have QLogic 10Gb Virtual Fabric Adaptor NICs, which use the qlcnic module. We want to netboot over the 10Gb adaptor, and this module is not included in the genesis image by default. After much experimentation, we pulled qlcnic.ko from /lib/modules/`uname -r`/kernel/drivers/net/qlcnic/ on a CentOS 6.5 VM into /opt/xcat/share/xcat/netboot/genesis/x86_64/fs/lib/modules/`uname -r`/kernel/drivers/net/qlcnic/ on the MN and ran depmod as outlined here: http://sourceforge.net/p/xcat/wiki/XCAT_iDataPlex_Advanced_Setup/#adding-drivers-to-the-genesis-boot-kernel This is the only way we could get it to work. If we pulled the module from the RHEL kernel tree, the node would PXE boot and stop with a "module signed with unknown public key" error. While I'm happy to have solved the problem, the solution is a kludge. Is there a more elegant solution I'm missing, or is this just a matter of the qlcnic module needing to be included in future releases? Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn
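The procedure described above can be sketched as a shell function that prints the commands rather than running them. Treat this as a rough outline under stated assumptions: the function name genesis_add_module_cmds is invented, the depmod -b call and the final mknb x86_64 rebuild are my reading of the wiki procedure rather than something quoted in this message, and the kernel version directory must be whatever actually exists under the genesis tree on your MN.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the steps for adding a driver to
# the xCAT genesis image. The genesis tree path is the one from the message;
# "depmod -b" and "mknb x86_64" are assumptions based on the wiki procedure.
genesis_add_module_cmds() {
    module=$1   # path to the .ko file to add, e.g. ./qlcnic.ko
    kver=$2     # kernel version directory inside the genesis tree
    subdir=$3   # location under the module tree, e.g. kernel/drivers/net/qlcnic
    fs=/opt/xcat/share/xcat/netboot/genesis/x86_64/fs
    echo "mkdir -p $fs/lib/modules/$kver/$subdir"
    echo "cp $module $fs/lib/modules/$kver/$subdir/"
    echo "depmod -b $fs $kver"
    echo "mknb x86_64"
}
```

Printing the commands first makes it easy to confirm the kernel version directory matches the genesis kernel before anything is copied into the image tree.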
Re: [xcat-user] xCAT 2.8.4 and HP IPMI
We actually saw this intermittently on x240 flex blades last week. Fresh 2.8.4 install with latest deps. Don't have firmware info though. uEFI was B2E126EUS-1.31, IMM was 2.60 1AOO42Y. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn From: Lanae Neild lne...@clemson.edu To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 2014-07-29 11:20 Subject:[xcat-user] xCAT 2.8.4 and HP IPMI Does anyone else have a similar issue or know how to work around it? We have HP SL250s some with their IPMI firmware rev 1.20 some with 1.40. Confirmed problem with both. They have their ILO settings configured the same way, and xCAT rboot and rinstall worked before we upgraded from 2.7x to 2.8.4. Now we're getting this error, and have to manually reboot them, as it no longer works with xCAT: [root@master ~]# rboot node1903 node1903: boot node1903: Error: Invalid role node1903: Error: Invalid role node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Invalid role node1903: Error: Invalid role node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Invalid role node1903: Error: Invalid role node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Invalid role node1903: Error: Invalid role node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) node1903: Error: Insufficient resources to 
create new session (wait for existing sessions to timeout) node1903: Error: Invalid role node1903: Error: Invalid role node1903: Error: Insufficient resources to create new session (wait for existing sessions to timeout) Here is the node definition for this one: [root@master ~]# lsdef node1903 Object name: node1903 addkcmdline=vga=0x303 rdblacklist=nouveau,mlx4_ib,mlx4_en,mlx4_core nouveau.modeset=0 arch=x86_64 bmc=node1903-man0 chain=runcmd=bmcsetup,standby cons=ipmi currchain=boot currstate=boot groups=phase09,all,hp,compute,SL250,gpu,k20 initrd=xcat/SL/x86_64/initrd.img interface=eth2 kcmdline=quiet repo=http://10.125.40.6/install/SL/x86_64/ ks= http://10.125.40.6/install/autoinst/node1903 ksdevice=eth2 cmdline console=tty0 console=ttyS0,19200n8r kernel=xcat/SL/x86_64/vmlinuz mac=2c:44:fd:97:18:88 mgt=ipmi mpa=hpsl250chassis33 mtm=SL250 netboot=pxe nfsdir=/install nfsserver=10.125.40.6 ondiscover=nodediscover os=SL postbootscripts=otherpkgs,palmetto-ipmi,serialconsole postscripts=syslog,remoteshell,syncfiles,resyslog,palmetto-mountxcat,palmetto-yumpackages,palmetto-nfsmounts,palmetto-syncfiles,palmetto-services,palmetto-bios,palmetto-ethernet,palmetto-umountxcat,palmetto-puppet power=ipmi primarynic=eth2 profile=compute provmethod=SL-x86_64-install-compute rack=BM-27 room=ITC serial=USE341HCLF serialflow=hard serialport=0 serialspeed=19200 slotid=3 status=booted statustime=10-24-2013 20:29:25 supportedarchs=x86,x86_64 switch=h-itc-bm27-d4810-117 switchinterface=0/1 switchport=1 tftpserver=10.125.40.6 unit=8-11 xcatmaster=10.125.40.6 [root@master ~]# Lanae Neild Systems Programmer I HPC, CCIT, Clemson University (864)505-4293 lne...@clemson.edu
[xcat-user] r-commands not working on flex systems
Installing Flex systems, and seeing the following error when running r-commands (rpower, reventlog, rvitals, etc.) [root@mgt ~]# rpower a1n01 stat a1n01: Error: Incorrect password provided I can ssh into the IMM for this node using the login and password found in the passwd table. The CMM appears to be working fine - rscan and rspconfig return no errors, and getmacs worked as well. Seeing this error on all of our Flex systems. Running xCAT 2.8.4 with latest deps package on rhel 6.4. [root@mgt ~]# lsdef a1n01 Object name: a1n01 arch=x86_64 bmc=a1n01-imm bmcpassword=PASSW0RD bmcusername=USERID chain=runcmd=bmcsetup,shell chassis=1 cons=ipmi getmac=blade groups=compute,flex,rack1,c1 hwtype=blade installnic=mac mac=6c:ae:8b:34:34:88 mgt=ipmi mpa=cmm01 mtm=8737AC1 netboot=xnba nfsserver=172.20.0.1 nodetype=mp ondiscover=nodediscover os=rhels6.4 postbootscripts=otherpkgs,hypercores postscripts=syslog,remoteshell,syncfiles,setupntp,hardeths,lm-ib,lm-pbs_mom power=impi profile=compute provmethod=netboot rack=A1 serial=KQ9AX3P serialflow=hard serialport=0 serialspeed=115200 slot=1 slotid=1 tftpserver=172.20.0.1 unit=3 [root@mgt ~]# lsdef cmm01 Object name: cmm01 chassis=1 groups=cmm,all hidden=0 hwtype=cmm id=0 ip=172.30.101.131 mgt=blade mpa=cmm01 mtm=8721HC1 nodetype=mp postbootscripts=otherpkgs postscripts=syslog,remoteshell,syncfiles rack=A1 serial=KQ9DN5T side=1 slot=0 switch=g8052-sw1 switchport=7 unit=3 I found this bug from awhile ago: http://sourceforge.net/p/xcat/bugs/3552/ , but I have not yet tried the work around suggested. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn
Re: [xcat-user] r-commands not working on flex systems
Trying those commands resulted in different errors, but we ultimately figured out the problem. There was an extra, incorrect, username and password in the ipmi table. So r-commands now work, but we have another problem. I can still SSH into the IMMs, but pasu doesn't work: pasu a1n01 save a1n01.out a1n01: Unable to validate userid/password on IMM. a1n01: Please make sure input the correct userid/password with supervisor authority level. a1n01: *** asu exited with error code 1. [root@mgt asu]# ssh USERID@a1n01-imm Password: system users Account Login ID AccessPassword Expires --- -- 1 USERID Read/Write Password doesn't expire system users -1 -n USERID -a Read/Write Password doesn't expire -sauth HMAC-SHA -spriv CBC-DES -sacc Set -strap 0.0.0.0 Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn From: Jarrod B Johnson/Raleigh/IBM@IBMUS To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 2014-07-23 09:01 Subject:Re: [xcat-user] r-commands not working on flex systems Well, there are a couple of ways to do this, the shortest manual way would be to ssh to the CMM and: system users -T mm[p] -am enabled system users -T mm[p] -n USERID -ipmisnmpv3 enabled While there, I personally also like to at least: system accseccfg -pe 0 -T mm[p] To remove password expiry. 'lsslp --flexdiscover' is supposed to take care of this automatically, but if it were manually setup, that could explain things. Christian Caruthers---07/23/2014 06:59:05 AM---Installing Flex systems, and seeing the following error when running r-commands (rpower, reventlog, From: Christian Caruthers/Richmond/IBM@IBMUS To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 07/23/2014 06:59 AM Subject: [xcat-user] r-commands not working on flex systems Installing Flex systems, and seeing the following error when running r-commands (rpower, reventlog, rvitals, etc.) 
[root@mgt ~]# rpower a1n01 stat a1n01: Error: Incorrect password provided I can ssh into the IMM for this node using the login and password found in the passwd table. The CMM appears to be working fine - rscan and rspconfig return no errors, and getmacs worked as well. Seeing this error on all of our Flex systems. Running xCAT 2.8.4 with latest deps package on rhel 6.4. [root@mgt ~]# lsdef a1n01 Object name: a1n01 arch=x86_64 bmc=a1n01-imm bmcpassword=PASSW0RD bmcusername=USERID chain=runcmd=bmcsetup,shell chassis=1 cons=ipmi getmac=blade groups=compute,flex,rack1,c1 hwtype=blade installnic=mac mac=6c:ae:8b:34:34:88 mgt=ipmi mpa=cmm01 mtm=8737AC1 netboot=xnba nfsserver=172.20.0.1 nodetype=mp ondiscover=nodediscover os=rhels6.4 postbootscripts=otherpkgs,hypercores postscripts=syslog,remoteshell,syncfiles,setupntp,hardeths,lm-ib,lm-pbs_mom power=impi profile=compute provmethod=netboot rack=A1 serial=KQ9AX3P serialflow=hard serialport=0 serialspeed=115200 slot=1 slotid=1 tftpserver=172.20.0.1 unit=3 [root@mgt ~]# lsdef cmm01 Object name: cmm01 chassis=1 groups=cmm,all hidden=0 hwtype=cmm id=0 ip=172.30.101.131 mgt=blade mpa=cmm01 mtm=8721HC1 nodetype=mp postbootscripts=otherpkgs postscripts=syslog,remoteshell,syncfiles rack=A1 serial=KQ9DN5T side=1 slot=0 switch=g8052-sw1 switchport=7 unit=3 I found this bug from awhile ago: http://sourceforge.net/p/xcat/bugs/3552/ , but I have not yet tried the work around suggested. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn
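Since the root cause in this thread was a stray username and password in the ipmi table, a quick way to spot such entries is to filter the table dump for rows that set a username. The following is a minimal sketch with assumptions: the function name ipmi_overrides is invented, it expects tabdump-style CSV on stdin with a header line starting with "#", it locates the username column from that header instead of hard-coding it, and it does not handle commas inside quoted fields.

```shell
#!/bin/sh
# Hypothetical helper: read tabdump-style CSV on stdin and list every row
# of the ipmi table that sets a username, printed as "<node> <username>".
# The column position is looked up in the "#node,..." header line.
ipmi_overrides() {
    awk -F, '
        /^#/ {
            sub(/^#/, "", $1)                      # drop the leading "#"
            for (i = 1; i <= NF; i++) col[$i] = i  # map column name to index
            next
        }
        col["username"] && $(col["username"]) != "" {
            gsub(/"/, "")                          # strip CSV quoting
            print $1, $(col["username"])
        }'
}
```

On the management node this would be used as tabdump ipmi | ipmi_overrides; any node or group listed with credentials that differ from the passwd table is a candidate for the kind of mismatch described above.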
Re: [xcat-user] trying to install new 3650m4 eth0 link is not ready
Under /tftpboot/pxelinux.cfg there should be a file named for the node you're trying to install. This file contains the kickstart boot command that's passed to the system in response to its PXE request. Can you send the contents of that file? Also, does this node have 10Gb ports, or any additional PCI Ethernet cards in it? If so, Red Hat more than likely sees port 1 on this card as eth0 while the system BIOS (or uEFI or whatever) sees the planar port 1 as eth0. Clearing out installnic and primarynic helps get around this. Where your install is failing, the network device NetworkManager is trying to initialize is dictated by the ksdevice option in the file I mentioned above. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn From: Damir Krstic damir.krs...@gmail.com To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 2014-07-17 11:51 Subject: Re: [xcat-user] trying to install new 3650m4 eth0 link is not ready I did the above nodech command and then did nodeset quser10 install and it still timed out with same message. damir On Thu, Jul 17, 2014 at 10:37 AM, Damir Krstic damir.krs...@gmail.com wrote: would this work: nodech quser10 noderes.installnic= ? On Thu, Jul 17, 2014 at 10:28 AM, Jarrod Johnson jarrod.b.john...@gmail.com wrote: What happens if you blank installnic? If not set it will autodetect and the result may surprise you. I recommend never setting installnic or primarynic on x86 anymore, since the autodetect works as desired 99.9% of the time. On Jul 17, 2014 10:05 AM, Damir Krstic damir.krs...@gmail.com wrote: we have 4 new login nodes that i am trying to deploy in next couple of days. 
they were autodiscovered (have mac in the mac table) and i am trying to install them now: nodeset quser10 install the installation stops at the following: NetworkManager: eth0 link is not ready eth0 deactivating device (screenshot included) lsdef of the node itself: Object name: quser10 arch=x86_64 bmc=quser10-bmc bmcpassword=PASSW0RD bmcport=0 bmcusername=USERID currchain=boot currstate=install rhels6.2-x86_64-user6 groups=user6,user6-profile,ipmi,bnt103-user6,x3650m2,all initrd=xcat/rhels6.2/x86_64/initrd.img installnic=eth0 ip=172.20.4.10 kcmdline=nofb utf8 ks=http://172.20.0.1/install/autoinst/quser10 ksdevice=eth0 console=tty0 console=ttyS0,115200 noipv6 kernel=xcat/rhels6.2/x86_64/vmlinuz mac=40:f2:e9:ce:e2:8a mgt=ipmi mtm=7914AC1 netboot=pxe nfsserver=172.20.0.1 os=rhels6.2 postbootscripts=otherpkgs,setupntp postscripts=syslog,remoteshell,syncfiles,syslog-adminnodes,ssh,ifcfg-eth,fstab,passwd,statefull_tasks6,ipoib primarynic=eth0 profile=user6 provmethod=install serial=06ATFXT serialport=0 serialspeed=115200 status=configuring statustime=07-16-2014 14:29:10 supportedarchs=x86,x86_64 switch=bnt103 switchinterface=eth0 switchport=1 switchvlan=1 tftpserver=172.20.0.1 xcatmaster=172.20.0.1 xcat version: [root@mgt rh]# xcatconfig --version Version 2.7.3 (svn r13117, built Mon Jun 18 05:12:28 EDT 2012) We will be deploying a new management node with updated xCAT as soon as the login nodes are provisioned. Thanks in advance for your help.
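The check suggested in this thread, looking at which ksdevice the node's PXE file hands to the installer, can be sketched as a one-line shell helper. This is only a sketch: the function name ksdevice_of is invented, and it assumes the ksdevice= option appears as a space-separated token in the file, as it does in the kcmdline shown in the lsdef output.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first ksdevice= option found
# in a PXE config file such as /tftpboot/pxelinux.cfg/<node>.
ksdevice_of() {
    # Split the file into one token per line, then keep the ksdevice value.
    tr ' ' '\n' < "$1" | sed -n 's/^ksdevice=//p' | head -n 1
}

# Commands discussed in the thread, shown here as comments only:
#   nodech quser10 noderes.installnic= noderes.primarynic=   # blank both NIC attributes
#   nodeset quser10 install                                   # regenerate the boot files
```

For the node in this thread, ksdevice_of /tftpboot/pxelinux.cfg/quser10 would show whether the installer is still being pointed at eth0 after the NIC attributes are cleared and nodeset is rerun.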
Re: [xcat-user] Intel RMM4 setup?
Latest core and deps packages. The ipmitool command you sent appears to work on both systems, but rcons still returns the error I originally sent. After running makeconservercf, I don't see an ipmitool-xcat process for these RMM-based systems - consoleondemand is not enabled. They use the same cons and management settings as the dx360 nodes in this cluster. Regards, Christian Caruthers Senior Consultant - System x Linux HPC Mobile: 757-289-9872 Find me on LinkedIn From: Er Tao Zhao erta...@cn.ibm.com To: xCAT Users Mailing list xcat-user@lists.sourceforge.net Date: 2014-07-09 02:39 Subject:Re: [xcat-user] Intel RMM4 setup? What is the version of xCAT you are using? Will you pls try the following command to check whether the console has any output? ipmitool-xcat -I lanplus -U userid -P password -H RMM IP sol activate Best Regards, --- Zhao Er Tao IBM China System and Technology Laboratory, Beijing Tel:(86-10)82450485 Email: erta...@cn.ibm.com Address: 1/F, 28 Building,ZhongGuanCun Software Park, No.8 DongBeiWang West Road, Haidian District, Beijing, 100193, P.R.China Christian Caruthers ---2014/07/09 05:26:24---Christian Caruthers christian.caruth...@us.ibm.com Christian Caruthers christian.caruth...@us.ibm.com 2014/07/09 05:28 Please respond to xCAT Users Mailing list xcat-user@lists.sourceforge.net To xcat-user@lists.sourceforge.net, cc Subject [xcat-user] Intel RMM4 setup? Trying to help a customer set up a system using Intel RMM4 system management. They've gone through the Intel Deployment Assistant which configured the RMM IP and supposedly set up SOL. So far rpower, rbeacon, reventlog rvitals all work; rinv hangs and rcons returns: Acquiring startup lock...done Error: Unable to establish IPMI v2 / RMCP+ session Error: No response de-activating SOL payload I cannot telnet into the RMM, only SSH. Anyone have any experience with this, or are they spinning their wheels trying to get SOL to work? 
The RMM documentation says it supports IPMI 2.0 and includes the following:

The BMC supports IPMI 2.0 SOL. IPMI 2.0 introduced a standard serial-over-LAN feature. This is implemented as a standard payload type (01h) over RMCP+. Three commands are implemented for SOL 2.0 configuration.

* “Get SOL 2.0 Configuration Parameters” and “Set SOL 2.0 Configuration Parameters”: These commands are used to get and set the values of the SOL configuration parameters. The parameters are implemented on a per-channel basis.
* “Activating SOL”: This command is not accepted by the BMC. It is sent by the BMC when SOL is activated to notify a remote client of the switch to SOL.

Activating a SOL session requires an existing IPMI-over-LAN session. If encryption is used, it should be negotiated when the IPMI-over-LAN session is established.

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872

___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
Re: [xcat-user] Intel RMM4 setup? SOLVED (For now)
Jarrod has offered a solution. Modify the end of /opt/xcat/share/xcat/ipmi like so:

    #my $inteloption="";
    #if ($isintel) {
    #    $inteloption=" -o intelplus";
    #    $inteloption="";
    #}
    if ($iface eq "lanplus") {
        system "$ipmitool -I lanplus -U $username -P '$password' -H $bmc $solcom deactivate"; #Stop any active session
    }
    exec "$ipmitool -I $iface -U $username -P '$password' -H $bmc $solcom activate";

Basically, remove or comment out every reference to $inteloption. I copied the original ipmi file to ipmi.rmm, set nodehm.cons=ipmi.rmm for the RMM nodes, and ran makeconservercf. That got the console working.

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872
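The effect of that change on the wrapper can be sketched as follows. This is only an illustration in Python, not the xCAT script itself (which is Perl): the function name `sol_commands`, its parameters, the default values `ipmitool-xcat` and `sol`, and the example address are assumptions made for the sketch. Only the ipmitool options quoted in the message are used.

```python
def sol_commands(bmc, username, password, iface="lanplus",
                 ipmitool="ipmitool-xcat", solcom="sol"):
    """Return the ipmitool argument lists the modified wrapper would run.

    No "-o intelplus" option is added, mirroring the commented-out
    $inteloption lines in the message above.
    """
    base = [ipmitool, "-I", iface, "-U", username, "-P", password, "-H", bmc]
    commands = []
    if iface == "lanplus":
        # Stop any active SOL session before opening a new one.
        commands.append(base + [solcom, "deactivate"])
    commands.append(base + [solcom, "activate"])
    return commands


if __name__ == "__main__":
    # 10.0.0.50 is a made-up BMC address used only for illustration.
    for argv in sol_commands("10.0.0.50", "userid", "password"):
        print(" ".join(argv))
```

Building argument lists rather than a single shell string avoids the quoting around the password that the Perl version needs.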
[xcat-user] Duplicate entries in networks table - bug?
Saw this after fat-fingering something in the networks table:

    tabdump networks
    #netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,comments,disable
    mgt,172.20.1.0,255.255.255.0,eth3,,,
    mgt,172.24.1.0,255.255.255.0,bond0,,,
    cmp,172.16.1.0,255.255.255.0,eth3,,,172.16.1.201-172.16.1.230

The network is different, but the netname is the same. When I wrote the table out this way, it did not throw an error. Aren't the netnames supposed to be unique?

xcat-core-2.8.4
xcat-dep-201407010254

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872
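Since the table accepted the repeated name without complaint, a small check over the tabdump output can at least flag such rows. The following is a minimal sketch, assuming the dump is comma-separated with `#`-prefixed header lines as shown above; the function name `duplicate_netnames` is hypothetical and the sample header is abbreviated.

```python
import csv
from collections import defaultdict


def duplicate_netnames(tabdump_text):
    """Map each netname that appears more than once to its list of nets."""
    nets_by_name = defaultdict(list)
    lines = [ln for ln in tabdump_text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip rows without both a netname and a net
        netname, net = row[0], row[1]
        nets_by_name[netname].append(net)
    return {name: nets for name, nets in nets_by_name.items() if len(nets) > 1}


# Rows copied from the message above; the header line is shortened.
SAMPLE = """\
#netname,net,mask,mgtifname,gateway
mgt,172.20.1.0,255.255.255.0,eth3,,,
mgt,172.24.1.0,255.255.255.0,bond0,,,
cmp,172.16.1.0,255.255.255.0,eth3,,,172.16.1.201-172.16.1.230
"""

if __name__ == "__main__":
    print(duplicate_netnames(SAMPLE))
```

The `csv` module is used so that quoted fields, if present in a real dump, are handled as well.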
[xcat-user] makedns doesn't create short name
Ran makedns -n, don't see any errors. When it's finished, I see:

    host n01
    Host n01 not found: 3(NXDOMAIN)
    host n01.cluster.net
    n01.cluster.net has address 172.20.0.11

Running 2.8.3 w/ latest deps.

    Object name: hpc-prov
        dhcpserver=172.20.0.1
        dynamicrange=172.20.0.100-172.20.0.254
        mask=255.255.255.0
        mgtifname=eth2
        nameservers=172.20.0.1
        net=172.20.0.0
        ntpservers=172.20.0.1
        tftpserver=172.20.0.1

172.20.0.1 is the xCAT MN.

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872
Re: [xcat-user] makedns doesn't create short name
Found a typo in resolv.conf. That was it.

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872

From: Dmitry Yulov dyu...@gmail.com
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date: 04/17/2014 12:05 AM
Subject: Re: [xcat-user] makedns doesn't create short name

Hello,

Please look at /etc/resolv.conf and check for these lines:

    search <YOUR DOMAIN>
    nameserver 172.20.0.1

Also, please check that host n01.<your domain> works.

Best Regards,
--
Dmitry.
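The check Dmitry describes can be automated. The sketch below is an assumption-laden illustration: the function name `check_resolv_conf` is hypothetical, and because the thread does not say what the actual typo was, the misspelled keyword in the usage note is invented for the example.

```python
def check_resolv_conf(text, domain, nameserver):
    """Return a list of problems found in resolv.conf-style text."""
    search_domains = []
    nameservers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        fields = line.split()
        if fields[0] == "search":
            search_domains = fields[1:]  # a later search line replaces earlier ones
        elif fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
    problems = []
    if domain not in search_domains:
        problems.append(f"search list does not contain {domain}")
    if nameserver not in nameservers:
        problems.append(f"nameserver {nameserver} is not listed")
    return problems


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        for problem in check_resolv_conf(handle.read(), "cluster.net", "172.20.0.1"):
            print(problem)
```

For instance, a file whose first line reads `serach cluster.net` instead of `search cluster.net` would be reported as missing the search domain, which matches the symptom in this thread: the fully qualified name resolves while the short name does not.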
Re: [xcat-user] Strange xCAT stateless image behavior
Thanks for catching that! We had used compute.pkglist, which did not include dracut. We switched over to compute.rhels6.x86_64.pkglist, which includes the dracut packages, but we still got the same error. I tried a second time with rdshell added to the kernel line in the node's elilo file, and saw this:

    ata2.00: SATA link down (SStatus 0 SControl 300)
    ata1.00: SATA link down (SStatus 0 SControl 300)
    ata1.01: SATA link down (SStatus 0 SControl 300)
    ata2.01: SATA link down (SStatus 0 SControl 300)
    sd 0:2:0:0: [sda] 64453103616 512-byte logical blocks: (32.9 TB/30.0 TiB)
    sd 0:2:0:0: [sda] 4096-byte physical blocks
    sd 0:2:0:0: [sda] Write Protect is off
    sd 0:2:0:0: [sda] Write cache: enabled, read cache: disabled, doesn't support DPO or FUA
    sda: unknown partition table
    sd 0:2:0:0: [sda] Attached SCSI disk
    dracut Warning: No root device 1 found
    Dropping to debug shell.
    sh: cannot set terminal process group (-1): Inappropriate ioctl for device
    sh: no job control in this shell
    dracut:/#

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872

From: Jonathan Mills jonmi...@renci.org
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date: 03/20/2014 09:12 AM
Subject: Re: [xcat-user] Strange xCAT stateless image behavior

Make sure you have the following RPMs installed inside the stateless image before running packimage:

    dracut-kernel-004-303.el6.noarch
    dracut-network-004-303.el6.noarch
    dracut-004-303.el6.noarch

Your version numbers may differ. These are from CentOS 6.4.

On 3/20/14, 8:51 AM, Christian Caruthers wrote:

Trying to set up the default compute stateless template in v2.8.3 on rhel 6.4. This system was first upgraded from xCAT 2.7.3 to 2.8.3 w/ latest deps. We took the default compute.pkglist and compute.exlist from $XCATROOT/share/xcat/netboot/rh.
When genimage is building the initrd, I see the following:

    Try to load drivers: pps_core mlx4_core mdio libcrc32c ptp dca virtio_ring virtio mlx4_en tg3 bnx2 bnx2x e1000 e1000e igb mlx_en virtio_net be2net ext3 ext4 to initrd.
    W: Cannot load dracut module xcat, dependencies failed.
    W: Dracut module xcat cannot be found.
    W: Dracut module nfs cannot be found.
    W: Dracut module network cannot be found.
    the initial ramdisk for statelite is generated successfully.
    Try to load drivers: pps_core mlx4_core mdio libcrc32c ptp dca virtio_ring virtio mlx4_en tg3 bnx2 bnx2x e1000 e1000e igb mlx_en virtio_net be2net ext3 ext4 to initrd.
    W: Cannot load dracut module xcat, dependencies failed.
    W: Dracut module xcat cannot be found.
    W: Dracut module nfs cannot be found.
    W: Dracut module network cannot be found.
    the initial ramdisk for stateless is generated successfully.

After running packimage, we boot the system and see the following in the console:

    Initalizing network drop monitor service
    Freeing unused kernel memory: 1264k freed
    Write protecting the kernel read-only data: 10240k
    Freeing unused kernel memory: 904k freed
    Freeing unused kernel memory: 1676k freed
    dracut: FATAL: No or empty root= argument
    dracut: Refusing to continue
    dracut Warning: Signal caught!
    /lib/dracut-lib.sh: line 83: /emergency/01-die.sh: No such file or directory
    Kernel panic - not syncing: Attempted to kill init!
    Pid: 1, comm: init Not tainted 2.6.32-358.el6.x86_64 #1
    Call Trace:
     ? panic+0xa7/0x16f
     ? do_exit+0x862/0x870
     ? do_group_exit+0x58/0xd0
     ? sys_exit_group+0x17/0x20
     ? system_call_fastpath+0x16/0x1b

The management node has a patched kernel, and it appears the stateless image is running the stock rhel 6.4 kernel.
Could this be the problem, or am I missing something?

    [root@mgt01 modules]# ls /install/netboot/rhels6.4/x86_64/compute/rootimg/lib/modules
    2.6.32-358.el6.x86_64
    [root@mgt01 modules]# uname -a
    Linux mgt01.cluster.net 2.6.32-358.18.1.el6.x86_64 #1 SMP Fri Aug 2 17:04:38 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
    [root@mgt01 modules]#

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872

--
Jonathan Mills
Systems Administrator
Renaissance Computing Institute
UNC-Chapel Hill
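The package check Jonathan suggests can be done against a pkglist before running genimage. What follows is a rough sketch under stated assumptions: the function name `missing_dracut_packages` is hypothetical, the pkglist is treated as one package name per line, and `#`-prefixed lines (comments and include directives) are simply skipped rather than followed.

```python
REQUIRED = ("dracut", "dracut-network", "dracut-kernel")


def missing_dracut_packages(pkglist_text, required=REQUIRED):
    """Return the required package names that the pkglist does not mention."""
    listed = set()
    for line in pkglist_text.splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue  # blank line, comment, or directive (not followed here)
        listed.add(entry)
    return [name for name in required if name not in listed]


if __name__ == "__main__":
    # Hypothetical pkglist contents, for illustration only.
    sample = "bash\nopenssh-server\ndracut\n"
    print(missing_dracut_packages(sample))
```

An empty result only shows that the packages were requested. As this thread shows, the boot can still fail for other reasons, so the genimage warnings remain the more direct signal.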
Re: [xcat-user] xCAT dhcpd overwrites ifcfg-eth* files?
Rod,

You need to make sure your ifcfg-eth* files have BOOTPROTO=static before you restart the interface. Also make sure the NetworkManager package is uninstalled and/or NM_CONTROLLED=no is set in the ifcfg-eth* files.

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872
Sent from Lotus Traveler

From: Lissa Valletta lis...@us.ibm.com
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Date: Mon, Feb 24, 2014 12:31
Subject: Re: [xcat-user] xCAT dhcpd overwrites ifcfg-eth* files?

Why are you using such an old level of xCAT? You should be using the current 2.8 release, 2.8.3. Even on 2.7 you should be at 2.7.8 by now.

Lissa K. Valletta
8-3/B10
Poughkeepsie, NY 12601
(tie 293) 433-3102

From: Engdahl, Rod engd...@visa.com
To: xcat-user@lists.sourceforge.net
Date: 02/24/2014 12:16 PM
Subject: [xcat-user] xCAT dhcpd overwrites ifcfg-eth* files?

I have an issue with a relatively new xCAT-maintained cluster of redhat nodes. We use xCAT and the associated DHCP functionality for system provisioning and upgrades, but all of our nodes normally have static IP addresses. At long intervals, we are finding that the ifcfg-eth* files are being overwritten, so we have to go in, add IPADDR and NETMASK lines to ifcfg-eth* files that no longer have them, and restart the network service. At the same time, ifcfg-eth*.xcat files are created, for example. I suspect that dhcp is doing this, but would like to confirm before I try to remediate.
RHEL 6.4
xCAT: Version 2.7.6 (svn r14451, built Tue Nov 27 21:57:27 EST 2012)
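The settings mentioned in this thread can be verified per interface file. The following is a minimal sketch, not an xCAT tool: the function names are hypothetical, and treating BOOTPROTO=none as equivalent to static is an assumption added here rather than something stated in the thread.

```python
def parse_ifcfg(text):
    """Parse KEY=value lines from ifcfg-style text into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def ifcfg_problems(text):
    """List deviations from the static configuration discussed above."""
    settings = parse_ifcfg(text)
    problems = []
    # "none" is accepted as well; that equivalence is an assumption of this sketch.
    if settings.get("BOOTPROTO", "").lower() not in ("static", "none"):
        problems.append("BOOTPROTO is not static")
    if settings.get("NM_CONTROLLED", "").lower() != "no":
        problems.append("NM_CONTROLLED is not set to no")
    for key in ("IPADDR", "NETMASK"):
        if not settings.get(key):
            problems.append(f"{key} is missing")
    return problems
```

If NetworkManager is not installed at all, the NM_CONTROLLED finding can be ignored. Running the check across the ifcfg-eth* files periodically would also show when a file has been rewritten without its IPADDR and NETMASK lines.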
Re: [xcat-user] Embedding GPFS 3.5 into xcat diskless boot images
That's a good point, Lissa. A way around that is to not use the MN in rc.local, but rather the GPFS primary or secondary server. You can even get clever and use a ping test to make sure one or the other is online before running mmsdrrestore. I forgot the security side of not allowing SSH from the compute nodes to the MN.

Regards,
Christian Caruthers
Senior Consultant - System x Linux HPC
Mobile: 757-289-9872

From: Lissa Valletta/Poughkeepsie/IBM@IBMUS
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Date: 01/23/2014 12:29PM
Subject: Re: [xcat-user] Embedding GPFS 3.5 into xcat diskless boot images

By default, xCAT does not allow passwordless ssh from the compute nodes to the MN. It only sets up passwordless ssh from the MN to the compute nodes. Setting the sshbetweennodes attribute to "NOGROUPS" is actually used to disallow passwordless ssh between the compute nodes. But the man page is a little vague. I will improve it.

Lissa K. Valletta
8-3/B10
Poughkeepsie, NY 12601
(tie 293) 433-3102

From: Christopher Samuel sam...@unimelb.edu.au
To: xcat-user@lists.sourceforge.net
Date: 01/22/2014 09:53 PM
Subject: Re: [xcat-user] Embedding GPFS 3.5 into xcat diskless boot images

On 23/01/14 06:26, Christian Caruthers wrote:

In the past, with stateless nodes, I install the rpms during image build and set rc.local to run mmsdrrestore on the stateless node to sync it with the cluster.

We use statelite and set this in litefile:

    "ALL","/var/mmfs/","persistent",,

so their state persists across reboots. Handy as we don't permit SSH back from compute/login nodes to the management node as root (sshbetweennodes set to "NOGROUPS").
cheers,
Chris
--
Christopher Samuel, Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/ http://twitter.com/vlsci
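The ping test mentioned at the top of this thread could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the ping options are those of Linux ping, the mmsdrrestore path and its -p option are assumed to match the local GPFS installation, and the command is only built, not executed.

```python
import subprocess


def ping_ok(host, timeout_s=2):
    """Return True if a single ICMP echo to host succeeds (Linux ping options)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def pick_server(candidates, is_reachable=ping_ok):
    """Return the first reachable host from candidates, or None."""
    for host in candidates:
        if is_reachable(host):
            return host
    return None


def restore_command(server, mmsdrrestore="/usr/lpp/mmfs/bin/mmsdrrestore"):
    """Build (but do not run) the restore command against the chosen server."""
    return [mmsdrrestore, "-p", server]
```

A boot script would call `pick_server` with the primary and secondary GPFS server names, in that order, and run the resulting command only when a server was found. The reachability probe is passed in as a parameter so the selection logic can be exercised without touching the network.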