Re: [xcat-user] [External] xCAT forcibly disabling SELinux and firewalld

2019-10-14 Thread Jarrod Johnson
The differences are mild at this point.  Generally speaking:
Lenovo has a test cycle.  When we enter a test cycle, we start with the last 
‘released’ version and backport or create fixes against that for release.

This came about because for a time we simply went with the latest release and 
got blindsided by some issues, so we stopped adopting new xcat.org releases mid 
test cycle. We now take what was there when the cycle starts and cherry-pick to 
close the gap.

One major difference, which may no longer hold, was that we took our Genesis 
base to CentOS7 while xcat.org was at CentOS6.

Another difference is we don’t have ‘snap’ in our version numbers: a tagged 
build gets a simple version, and development builds incorporate the count of 
commits since the last release and part of a git commit hash.
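
The scheme described above resembles what `git describe --tags --long` prints 
(`<tag>-<count>-g<hash>`). A minimal sketch of turning such a string into a 
version, assuming that input format; the output layout, the sample values and 
the name `describe_to_version` are hypothetical and are not the actual Lenovo 
build logic:

```python
import re

# `git describe --tags --long` prints <tag>-<commits since tag>-g<short hash>.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def describe_to_version(describe):
    """Map a git-describe string to a version string (hypothetical layout)."""
    m = DESCRIBE_RE.match(describe.strip())
    if m is None:
        raise ValueError("not a git describe --long string: %r" % describe)
    tag, count, sha = m.group("tag"), int(m.group("count")), m.group("sha")
    if count == 0:
        # The build sits exactly on a tag: plain version, no suffix.
        return tag
    # Development build: commit count since the tag plus part of the hash.
    return "%s.dev%d+g%s" % (tag, count, sha)
```

With this layout a tagged build keeps the bare tag, while anything built 
between tags sorts after it and still identifies the exact commit.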

So in short, the difference practically speaking is pretty much a slightly 
different cadence for releases.  The concept of a different meta-package is a 
bit more than I would like to deviate.

As for xCAT ‘3.0’, I can’t commit to whether it will be called that, but it 
intends to cover at least a large chunk of the capability ‘real soon now’.

Currently it can do:
-Discovery based on switch or serial number/spreadsheet or blade-topology in 
ThinkSystem D2
-Hardware management (power on/off, eventlogs, firmware updates, raid config, 
bios settings, etc) using redfish or ipmi
-Text console management and logging
-Web and CLI access
-Some quick shorthands that aren’t too special (nodeshell is psh but with some 
expressions, noderun is the same but for running commands locally rather than 
on the target nodes, nodersync, etc)
-‘Service node’ equivalent behavior through collective (no external DB or other 
prereqs required at the moment, a clustered filesystem may start being needed 
for OS install repository when that time comes)

What it almost certainly will do:
-Essential deployment bootstrapping (networking and ssh or similar, with fewer 
deviations from a default configuration)
-OS install of Linux distributions to disk (CentOS7+8, SuSE 15, Ubuntu 18.04+) 
from pxe or secure methods (with a filesystem-based ‘osimage’ repository rather 
than a database-oriented one)
-Boot customized OS payloads (pass through kernel/initrd with some basic 
parameterization)
-Update DNS records through DDNS as requested

What it shouldn’t have to do anymore:
-Update dhcp configuration (integrated DHCP packet handling, better DHCP 
interoperability, more front and center support for static without DHCP)

Things that are not currently prioritized:
-Virtual machine management: Perhaps some for test, but we have not received a 
lot of feedback on interest in our take on VM management for day to day 
operations
-Stateless (well, this one is a long story, but essentially if desired we’d 
probably do the most straightforward usage of the current genimage combined 
with the ‘customized’ payloads.  In short we don’t have any particularly useful 
ideas beyond the current wrapping of yum/zypper/debootstrap).  Also we may put 
more effort into ostree type approaches as an alternative with more outside 
backing than xCAT stateless.
-Open ended postscript framework: Here the thought would be to see whether 
recommending salt or ansible and enabling that would get the same value with 
more alignment with the current internet consensus.  This would have us making 
a bit more ‘integrated’ support for networking and remote shell, perhaps some 
other select content, and delegating the open ended portions to other scripts.

Overall a lot of focus on having the liberty to change things to address some 
annoyances, such as:
-There are currently several different candidate places to indicate the 
credentials for devices, some group compatible some not.  In confluent 
everything is a node/group attribute and no more ‘site’ or ‘passwd’ tables (an 
‘everything’ automatic group is provided as a conceptually compatible stand-in 
for the concepts)
-OS image content split between filesystem and database. We want to fix this by 
having the filesystem be everything and no DB indexing required
-Tighter and more explicit security behaviors.  No more loop mounting, more 
ability to run as non-root, narrowing to a few opt-in controls for security 
judgement calls, tighter link between some security related things during 
deployment, support for no-local-password deployment, context-aware blocking of 
‘self’ calls from nodes
-Pain of SSL over localhost: confluent uses unix domain sockets and adds 
support for system passwords for remote socket and web ui access instead of TLS 
client certs, though practically no one would use the remote CLI support 
instead of the collective mode.
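
To illustrate the unix domain socket point: access to a local socket is gated 
by filesystem permissions on the socket path, so no TLS setup is needed for 
local clients. This is a generic sketch and not confluent's actual API; the 
socket name `demo.sock` and both function names are made up for the example.

```python
import os
import socket
import tempfile
import threading


def serve_once(path, ready):
    """Accept one connection on a unix socket and echo the data upper-cased."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(path)
        os.chmod(path, 0o600)  # only the owning user may connect
        srv.listen(1)
        ready.set()
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024).upper())


def round_trip(message):
    """Start the one-shot server, send message over the socket, return the reply."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        ready = threading.Event()
        server = threading.Thread(target=serve_once, args=(path, ready))
        server.start()
        ready.wait()
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
            cli.connect(path)
            cli.sendall(message)
            reply = cli.recv(1024)
        server.join()
        return reply
```

Calling `round_trip(b"ping")` exercises the whole path without any certificate 
or port configuration.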

Many of those little annoyances are wallpapered over by xcatconfig and rpm 
install, but it would be better to have the behaviors be ‘natural’ and not 
require too much be done automatically to make up for the 

Re: [xcat-user] [External] xCAT forcibly disabling SELinux and firewalld

2019-10-14 Thread Vinícius Ferrão via xCAT-user
Thanks Jarrod.

Opened the issue: https://github.com/xcat2/xcat-core/issues/6445

Just for the sake of completeness: what’s the difference between the upstream 
and the Lenovo build? There’s nothing explaining it on hpc.lenovo.com.

It appears to be tied to Confluent. I heard that Confluent would eventually 
replace xcatd and become the xCAT 3.0 release. Is this still true?

Thanks.

On 14 Oct 2019, at 09:38, Jarrod Johnson <jjohns...@lenovo.com> wrote:

I think it is fine, but on the other hand, I can only personally provide such a 
meta package in the lenovo branches.  I could open a pull request but I can't 
guarantee that it would be accepted.



From: Vinícius Ferrão <fer...@versatushpc.com.br>
Sent: Saturday, October 12, 2019 11:13 PM
To: Jarrod Johnson
Cc: xCAT Users Mailing list
Subject: Re: [External] [xcat-user] xCAT forcibly disabling SELinux and 
firewalld

Jarrod, do you think it’s okay to raise an issue on 
https://github.com/xcat2/xcat-core/issues to request this new meta package?

Thanks,

On 26 Sep 2019, at 03:54, Jarrod Johnson <jjohns...@lenovo.com> wrote:

I've been considering removing all of that from executing on rpm install (also 
enabling services to start on boot just by installing rpm)

It was added for convenience of not asking to run a setup after install but it 
is inconsistent with general rpm behavior and limits ability to use flags to 
customize behavior.

On the flip side, this would be a change that people would have to learn and 
would surprise new installs.

I might make a variant of the xCAT meta package with no auto setup so that 
people won't be surprised unless they opt into the other package.

Looking for thoughts.

For wider information, it doesn't yet have os deployment, but confluent has 
been developing and designing specifically with firewall and selinux in mind, 
as well as trying to mitigate the initial setup complexity that drove us to 
create xcatconfig in the first place.  For example no more tls certs required 
for local access and os import will no longer loop mount isos (one of the 
biggest selinux problems) and avoid rewriting other service etc files in daemon 
context.  More straightforward network usage and a documented set of firewalld 
commands.

From: Vinícius Ferrão via xCAT-user <xcat-user@lists.sourceforge.net>
Sent: Thursday, September 26, 2019 2:27:10 AM
To: xCAT Users Mailing list
Cc: Vinícius Ferrão
Subject: [External] [xcat-user] xCAT forcibly disabling SELinux and firewalld

Hello,

When installing xCAT on EL7 with yum install xCAT, it just puts SELinux in 
permissive mode and disables firewalld.

It does not even ask about it. It just does.

[root@headnode ~]# getenforce
Permissive
[root@headnode ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

Sep 26 02:55:55 headnode.cluster.iq.ufrj.br systemd[1]: Starting firewalld - dynamic firewall daemon...
Sep 26 02:55:56 headnode.cluster.iq.ufrj.br systemd[1]: Started firewalld - dynamic firewall daemon.
Sep 26 03:09:18 headnode.cluster.iq.ufrj.br systemd[1]: Stopping firewalld - dynamic firewall daemon...
Sep 26 03:09:21 headnode.cluster.iq.ufrj.br systemd[1]: Stopped firewalld - dynamic firewall daemon.
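
The status output above shows the unit disabled even though the vendor preset 
is enabled. A small sketch that extracts those fields from saved 
`systemctl status` text, so the state before and after a package install can be 
compared; it assumes the `Loaded:` line format shown above, and the function 
names are illustrative only.

```python
import re

SAMPLE = """\
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
"""


def unit_states(status_text):
    """Return (unit_file_state, vendor_preset, active_state) from status text."""
    loaded = re.search(
        r"Loaded: \w+ \([^;]+; (\w+); vendor preset: (\w+)\)", status_text
    )
    active = re.search(r"Active: (\w+)", status_text)
    if loaded is None or active is None:
        raise ValueError("unrecognized systemctl status output")
    return loaded.group(1), loaded.group(2), active.group(1)


def deviates_from_preset(status_text):
    """True when the preset enables the unit but the unit file state differs."""
    state, preset, _ = unit_states(status_text)
    return preset == "enabled" and state != "enabled"
```

For the sample text, `deviates_from_preset(SAMPLE)` is true, which matches the 
complaint that something other than the distribution default turned the 
service off.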

Is there a way to avoid this behaviour?

Thanks,

PS: I’m aware of the consequences of firewalld and SELinux in xCAT environments.
___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user





Re: [xcat-user] xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Thomas HUMMEL

On 10/14/19 5:37 PM, Vinícius Ferrão via xCAT-user wrote:

> Thomas,
>
> Do you have all the entries in /etc/hosts and on the DNS?

Actually, I don't have any node entries in /etc/hosts.

> They are redundant, I know, but sometimes xCAT picks values from /etc/hosts and
> sometimes from DNS. This is really a problem, but you get used to it.

Ok.

> If you’re unable to create the entries with makehosts, since you said that the
> machine isn’t an xCAT object, you can put the entries manually. xCAT will not
> override.
>
> I assumed this with your feedback, not sure if it’s the problem either.

Thanks for your advice, but I think I prefer using the short name or IP 
address since it seems to work around the issue.


--
TH.






Re: [xcat-user] xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Thomas HUMMEL

On 10/14/19 3:56 PM, Casandra H Qiu wrote:

> Are you able to run "nslookup maestro-xcat.maestro.pasteur.fr"?
>
> The "getipaddr" subroutine will fail if it can't perform a lookup on
> that name.

Hello,

Yes, I can resolve the name.

Thanks.

--
TH.






[xcat-user] Automatic reply: xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Heckes Frank (CI/OSB4) via xCAT-user
Hello,

Many thanks for your e-mail.  I'm out of the office October 14th - 18th and will 
be back in the office on October 21st.

Best regards
-Frank Heckes


Re: [xcat-user] xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Vinícius Ferrão via xCAT-user
Thomas, 

Do you have all the entries in /etc/hosts and on the DNS? They are redundant, I 
know, but sometimes xCAT picks values from /etc/hosts and sometimes from DNS. 
This is really a problem, but you get used to it.

If you’re unable to create the entries with makehosts, since you said that the 
machine isn’t an xCAT object, you can put the entries manually. xCAT will not 
override.

I assumed this with your feedback, not sure if it’s the problem either.

Thanks,


> On 14 Oct 2019, at 07:14, Thomas HUMMEL  wrote:
> 
> On 10/14/19 11:22 AM, Thomas HUMMEL wrote:
>> On 10/14/19 10:59 AM, Thomas HUMMEL wrote:
>>> following service nodes: ,maestro-xcat.maestro.pasteur.fr
>> Sorry, I just noticed I had the above typo in my site table:
>> "master",",maestro-xcat.maestro.pasteur.fr",,
>> Now I changed it to
>> "master","maestro-xcat.maestro.pasteur.fr",,
>> I get
>> # xdcp maestro-300 -F /opt/test/synclists/list.synclist
>> Error: [maestro-xcat]: Error from pping
>> But I can pping the node :
>> [root@maestro-xcat opt]# pping maestro-300
>> maestro-300: ping
> 
> I can fix it now but can't quite explain what happens :
> 
> The problem was that pping was not able to pping the master 
> (maestro-xcat.maestro.pasteur.fr) itself.
> 
> I can make it work by
> 
> - either using the ip as the "master" attribute value in the site table
> - or using the non fqdn (maestro-xcat) value in site table
> 
> This is quite confusing, as man site mentions "The hostname of the xCAT 
> management node, as known by the nodes"
> 
> Note : in any case my maestro-xcat management node is not an xCAT node object 
> itself, which may itself be a bad practice (but it has always worked for me 
> this way)
> 
> Thanks
> 
> --
> TH




Re: [xcat-user] xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Casandra H Qiu

Are you able to run "nslookup maestro-xcat.maestro.pasteur.fr"?

The "getipaddr" subroutine will fail if it can't perform a lookup on that
name.
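
A minimal sketch of the same check through the system resolver, which on a 
typical Linux host consults /etc/hosts as well as DNS depending on the 
name-service configuration, whereas nslookup queries DNS only. The name 
`resolve_all` is an assumption for illustration; this is not the xCAT 
"getipaddr" routine itself.

```python
import socket


def resolve_all(name):
    """Return the sorted unique addresses for name, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0]
    # is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Running `resolve_all("maestro-xcat.maestro.pasteur.fr")` on the management node 
and comparing it with the nslookup answer shows whether the two lookup paths 
agree.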


Thanks,
Casandra Qiu

...
Casandra Hong Qiu
Phone: (845) 433-9291, t/l 293-9291
Office: Building 8, 3-B-04
cxh...@us.ibm.com







Re: [xcat-user] [External] xCAT forcibly disabling SELinux and firewalld

2019-10-14 Thread Jarrod Johnson
I think it is fine, but on the other hand, I can only personally provide such a 
meta package in the lenovo branches.  I could open a pull request but I can't 
guarantee that it would be accepted.





Re: [xcat-user] xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Thomas HUMMEL

On 10/14/19 11:22 AM, Thomas HUMMEL wrote:
> On 10/14/19 10:59 AM, Thomas HUMMEL wrote:
>> following service nodes: ,maestro-xcat.maestro.pasteur.fr
>
> Sorry, I just noticed I had the above typo in my site table:
>
> "master",",maestro-xcat.maestro.pasteur.fr",,
>
> Now I changed it to
>
> "master","maestro-xcat.maestro.pasteur.fr",,
>
> I get
>
> # xdcp maestro-300 -F /opt/test/synclists/list.synclist
> Error: [maestro-xcat]: Error from pping
>
> But I can pping the node:
>
> [root@maestro-xcat opt]# pping maestro-300
> maestro-300: ping


I can fix it now but can't quite explain what happens :

The problem was that pping was not able to pping the master 
(maestro-xcat.maestro.pasteur.fr) itself.


I can make it work by

- either using the ip as the "master" attribute value in the site table
- or using the non fqdn (maestro-xcat) value in site table

This is quite confusing, as man site mentions "The hostname of the xCAT 
management node, as known by the nodes"
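
The thread does not establish why the FQDN fails while the short name and the 
IP work, so the following is only a sketch for comparing what each candidate 
"master" value resolves to on the management node. The helper names and the 
injectable `resolver` parameter are assumptions made for the example.

```python
import socket


def system_resolver(name):
    """Resolve name via the system resolver; return an empty set on failure."""
    try:
        return {info[4][0] for info in socket.getaddrinfo(name, None)}
    except socket.gaierror:
        return set()


def compare_candidates(candidates, resolver=system_resolver):
    """Map each candidate 'master' value to the sorted addresses it resolves to."""
    return {name: sorted(resolver(name)) for name in candidates}


def unresolved(candidates, resolver=system_resolver):
    """Return the candidates that resolve to no address at all."""
    table = compare_candidates(candidates, resolver)
    return [name for name in candidates if not table[name]]
```

Passing the FQDN, the short name and the IP in one call makes it easy to spot a 
candidate that resolves to nothing or to a different address than the others.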


Note : in any case my maestro-xcat management node is not an xCAT node 
object itself, which may itself be a bad practice (but it has always 
worked for me this way)


Thanks

--
TH





Re: [xcat-user] xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Thomas HUMMEL

On 10/14/19 10:59 AM, Thomas HUMMEL wrote:

> following service nodes: ,maestro-xcat.maestro.pasteur.fr

Sorry, I just noticed I had the above typo in my site table:


"master",",maestro-xcat.maestro.pasteur.fr",,

Now I changed it to

"master","maestro-xcat.maestro.pasteur.fr",,

I get

# xdcp maestro-300 -F /opt/test/synclists/list.synclist
Error: [maestro-xcat]: Error from pping

But I can pping the node :

[root@maestro-xcat opt]# pping maestro-300
maestro-300: ping
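
A stray comma like the one above is easy to miss by eye. A minimal sketch that 
scans site table rows in the quoted CSV form shown above and flags values with 
a leading or trailing comma or whitespace; the function name and the choice of 
checks are assumptions for illustration, not an xCAT tool.

```python
import csv
import io


def suspicious_site_rows(table_text):
    """Yield (key, value, reason) for rows whose value looks malformed."""
    for row in csv.reader(io.StringIO(table_text)):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank lines and a '#'-prefixed header line
        key, value = row[0], row[1]
        if value != value.strip():
            yield key, value, "leading or trailing whitespace"
        elif value.startswith(",") or value.endswith(","):
            yield key, value, "leading or trailing comma"
```

Fed the first "master" row quoted above, it reports the value 
`,maestro-xcat.maestro.pasteur.fr`; the corrected row passes silently.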

Thanks.

--
TH




[xcat-user] xdcp -F / updatenode -F "Noderange missing"

2019-10-14 Thread Thomas HUMMEL

Hello,

Using xCAT-server-2.14.6 on CentOS 7.7 x86_64, I'm experiencing the 
following when trying a simple xdcp command :



# find /opt/test/
/opt/test/
/opt/test/synclists
/opt/test/synclists/list.synclist
/opt/test/foobar.txt

# cat /opt/test/foobar.txt
HELLO

# cat /opt/test/synclists/list.synclist
/opt/test/foobar.txt -> /tmp/foobar.txt

# xdcp maestro-300 -F /opt/test/synclists/list.synclist
Error: [maestro-xcat]: Noderange missing in command input
Error: [maestro-xcat]: 
File:/var/xcat/syncfiles/opt/test/synclists/list.synclist does not exist.
Error: [maestro-xcat]: Failed to dispatch command to any of the 
following service nodes: ,maestro-xcat.maestro.pasteur.fr
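
For checking the synclist itself independently of xdcp, here is a minimal 
sketch that parses the simple "source -> destination" form used above. It 
covers only that form; treating '#' lines as comments and allowing several 
whitespace-separated sources are assumptions, and `parse_synclist` is a 
hypothetical helper, not part of xCAT.

```python
def parse_synclist(text):
    """Return a list of (sources, destination) pairs from simple synclist lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comments are ignored
        if "->" not in line:
            raise ValueError("no '->' separator in line: %r" % raw)
        left, right = line.split("->", 1)
        sources, dest = left.split(), right.strip()
        if not sources or not dest:
            raise ValueError("missing source or destination in line: %r" % raw)
        entries.append((sources, dest))
    return entries
```

The list.synclist shown above parses to a single entry, which suggests the 
file content is not what triggers the "Noderange missing" error.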


The same test on a similarly configured older xCAT 
(xCAT-server-2.11-snap201511300543.noarch / CentOS 6.10) works as expected.


Note : I don't use any service node - or at least I don't think I do ;-) 
(so the "Failed to dispatch command to any of the following service 
nodes: ,maestro-xcat.maestro.pasteur.fr" confuses me).


The site table of the old xCAT has the following entries:

"SNsyncfiledir","/var/xcat/syncfiles",,
"nodesyncfiledir","/var/xcat/node/syncfiles",,

The site table of the new xCAT is the same (I also tried "/" for 
"SNsyncfiledir" instead of "/var/xcat/syncfiles", but that does not change 
the "Noderange missing" issue)


Can you help me figure out what's going on ?


Thanks

--
Thomas HUMMEL



