Re: [xcat-user] [External] BitTorrent distribution of stateless images with xCAT interesting to anyone?

2023-04-01 Thread Vinícius Ferrão via xCAT-user
Lenovo's provisioner:

hpc.lenovo.com (Downloads)


Re: [xcat-user] [External] BitTorrent distribution of stateless images with xCAT interesting to anyone?

2023-04-01 Thread Tomer Shachaf
Can anybody explain to me what confluent is?

Best regards,

Tomer Shachaf | Integration and Infrastructure Engineer | Integration and
Infrastructure Division | Matrix | Mobile 054-2686841 |
tomers...@matrix.co.il |
www.matrix.co.il


On 29 Mar 2023, at 20:40, Jarrod Johnson  wrote:


For reference, I did a couple of BitTorrent-style diskless implementations as
a project years ago, but never mainstreamed them.  In the end the performance
uplift wasn't as noticeable as one might have guessed, for an environment
where the boot servers had at least 10G networking.

Note that nowadays I've moved my development attention to confluent.  Also
note that confluent never pushes private SSH keys (node-to-node SSH, when
enabled, is facilitated through an SSH certificate authority and a helper that
generates shosts.equiv).

On confluent diskless, there is an interesting benefit that becomes a
challenge for BitTorrent: a typical diskless node never downloads the whole
diskless image.  This means less RAM is consumed by the diskless image, and
also that the diskless image can be large without pruning.  Further, even the
bits that were 'downloaded' may be evicted as needed by the kernel's memory
management, so the current expectation is that we don't expend resources on a
diskless node to retain the image unless we absolutely need it.  A typical
BitTorrent flow, where every peer holds the data it serves, would erode this
benefit.
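
To make the on-demand idea concrete, here is a minimal sketch, assuming the
image is fetched in fixed-size blocks over HTTP range requests.  That is an
assumption for illustration only; how urlmount actually fetches data is not
described in this thread, and the function name and block size are made up.

```shell
#!/bin/sh
# Sketch only: compute the inclusive byte range that covers one fixed-size
# block of a remote image, in the "start-end" form used by HTTP Range requests.
# Usage: block_range INDEX BLOCK_SIZE
block_range() {
    index=$1
    size=$2
    start=$((index * size))
    end=$((start + size - 1))
    printf '%s-%s\n' "$start" "$end"
}

# Hypothetical usage (URL variables borrowed from imageboot.sh):
#   curl -r "$(block_range 3 1048576)" \
#       "https://$confluent_whost/confluent-public/os/$confluent_profile/rootimg.sfs"
```

A node that only ever asks for the blocks it touches never holds the full
image, which is exactly what a conventional torrent client would undo.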

One could imagine a BitTorrent scenario that would erode less of the value but
would still come at a price.  If a similar trick were done to torrent only the
parts needed locally, then the critical portion for boot would be memory
resident on each node.  We would still lose the ability for the kernel to free
up that memory (either as needed or via drop_caches), and much of the boot-up
content does not need to be read again, so dropping its cache after boot can
offer a benefit.
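
For readers unfamiliar with the drop_caches knob mentioned above, a minimal
sketch follows.  The function name is made up, and the target path is a
parameter only so the function can be tried without root; on a real node it
would be /proc/sys/vm/drop_caches.

```shell
#!/bin/sh
# Sketch only: ask the kernel to drop clean page cache.
# Writing 1 frees page cache, 2 frees dentries and inodes, 3 frees both.
# Usage: drop_page_cache [TARGET]   (TARGET defaults to the real sysctl file)
drop_page_cache() {
    target=${1:-/proc/sys/vm/drop_caches}
    sync    # flush dirty pages first so more of the cache becomes droppable
    echo 1 > "$target"
}
```

Pages that a torrent client keeps in its own memory to serve peers are not
page cache, so a call like this would not reclaim them.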

Incidentally, another difference between xCAT and confluent diskless images:
in confluent the diskless images are now encrypted.  This affords protection
in case your diskless image contains some sensitive material.  The decryption
key is available through the confluent API, and access is generally
authenticated by the node's TPM, so a diskless node persists trust by having
the same TPM that had previously been authenticated.  This allows the
transport security to matter less, though our security policies are pretty
insistent that https be used at all times.

I would be interested in developing a torrent-style boot design with
confluent, with the lower-hanging fruit being 'untethered' mode, which is
still available and does download the whole image (at the expense of RAM
usage).  Interestingly, the logic is no longer inside the packed initramfs,
but is loose in the profile.  The link to the Red Hat 9 style diskless
bootstrap is:
https://github.com/lenovo/confluent/blob/master/confluent_osdeploy/el9-diskless/profiles/default/scripts/imageboot.sh
Notably:

if [ "untethered" = "$(getarg confluent_imagemethod)" ]; then
    # untethered: download the whole image into a RAM-backed tmpfs
    mount -t tmpfs untethered /mnt/remoteimg
    curl https://$confluent_whost/confluent-public/os/$confluent_profile/rootimg.sfs -o /mnt/remoteimg/rootimg.sfs
else
    # default: mount the image by URL so data is fetched as it is needed
    confluent_urls="$confluent_urls https://$confluent_whost/confluent-public/os/$confluent_profile/rootimg.sfs"
    /opt/confluent/bin/urlmount $confluent_urls /mnt/remoteimg
fi

That is the logic for getting the image.  One thing to note is that in a
typical diskless image boot in confluent, the booted system does not see
rootimg.sfs, so the torrent execution would have to stay in the 'initramfs'
world (which does persist after boot, as a separate mount namespace).
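
As a rough sketch of where a torrent mode could slot into that branch, the
example below uses a simplified stand-in for dracut's getarg and a made-up
"torrent" value for confluent_imagemethod.  Neither the "torrent" value nor
the function names exist in confluent, and "tethered" is just a label used
here for the default urlmount path.

```shell
#!/bin/sh
# Simplified stand-in for dracut's getarg: print the value of KEY from a
# kernel command line string. The real getarg reads the kernel command line
# itself and handles more cases; this one takes it as the second argument.
cmdline_arg() {
    key=$1
    for word in $2; do
        case $word in
            "$key"=*) printf '%s\n' "${word#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Pick the image method the way the snippet above branches, plus a
# hypothetical "torrent" value that does not exist in confluent today.
image_method() {
    case $(cmdline_arg confluent_imagemethod "$1") in
        untethered) echo untethered ;;
        torrent)    echo torrent ;;
        *)          echo tethered ;;
    esac
}

# Example: image_method "$(cat /proc/cmdline)"
```

Whatever the torrent branch ends up running would have to be started from
this initramfs-side script, since that is the only place rootimg.sfs is
visible.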






From: Dr. Thomas Orgis 
Sent: Wednesday, March 29, 2023 11:37 AM
To: xCAT Users Mailing list 
Subject: [External] [xcat-user] BitTorrent distribution of stateless images 
with xCAT interesting to anyone?

Hi,

I first came into contact with xCAT through our HPC system installed in 2015,
with xCAT version … hm …

# nodels --version
Version 2.9.1 (git commit 7f6043fffd62d482931b17b60f9488eb5754fdc1, built Thu Mar 19 03:25:35 EDT 2015)

2.9.1 seems to be it. The base system is CentOS 7.x. Since the system
was an en bloc purchase, we never updated xCAT, but I just adapted it
to our needs and then let it do its thing over the years. I made some
small changes, like fixing up /etc/hostname in initrd (not sure if
that was a specific mixup in our setup with long and short hostnames)