On 22/02/12 17:58, Mike Burns wrote:
> On Wed, 2012-02-22 at 17:33 +0200, Doron Fediuck wrote:
>> On 22/02/12 16:57, Mike Burns wrote:
>>> There has been a lot of interest in being able to run stateless Nodes
>>> with ovirt-engine. ovirt-node has designed a way [1] to achieve this on
>>> the node side, but we need input from the engine and vdsm teams to see
>>> if we're missing some requirement or if there needs to be changes on the
>>> engine/vdsm side to achieve this.
>>>
>>> As it currently stands, every time you reboot an ovirt-node that is
>>> stateless, it would require manually removing the host in engine, then
>>> re-registering/approving it again in engine.
>>>
>>> Any thoughts, concerns, input on how to solve this?
>>>
>>> Thanks
>>>
>>> Mike
>>>
>>> [1] http://ovirt.org/wiki/Node_Stateless
>>>
>>
>> Some points need to be considered;
>>
>> - Installation issues
>>
>> * Just stating the obvious, which is users need
>> to remove-add the host on every reboot. This will
>> not make this feature a lovable one from user's point of view.
>
> Yes, this is something that will cause this to be a non-starter. We'd
> need to change something in the engine/vdsm to make it smoother.
> Perhaps, a flag in engine on the host saying that it's stateless. Then
> if a host comes up with the same information, but no certs, etc, it
> would validate some other embedded key (TPM, key embedded in the node
> itself), and auto-approve it to be the same state as the previous boot

This will require some thinking.

>>
>> * During initial boot, vdsm-reg configures the networking
>> and creates a management network bridge. This is a very
>> delicate process which may fail due to networking issues
>> such as resolution, routing, etc. So re-doing this on
>> every boot increases the chances of losing a node due
>> to network problems.
>
> vdsm-reg runs on *every* boot anyway and renames the bridge. This is
> something that was debated previously, but it was decided to re-run it
> every boot.
>

Close, but not exactly; vdsm-reg will run on every boot,
but if the relevant bridge is found, then networking is unchanged.

>>
>> * CA pollution; generating a certificate on each reboot
>> for each node will create a huge number of certificates
>> in the engine side, which eventually may damage the CA.
>> (Unsure if there's a limitation to certificates number,
>> but having hundreds of junk certs can't be good).
>
> We could have vdsm/engine store the certs on the engine side, and on
> boot, after validating the host (however that is done), it will load the
> certs onto the node machine.
>

This is a security issue, since the key pair should be
generated on the node. This will lead us back to your TPM
suggestion, but (although I like it) will cause us to
be TPM-dependent, not to mention a non-trivial implementation.

>>
>> * Today there's a supported flow that for nodes with
>> password, the user is allowed to use the "add host"
>> scenario. For stateless, it means re-configuring a password
>> on every boot...
>
> Stateless is really targeted for a PXE environment. There is a
> supported kernel param that can be set that will set this password.
> Also, if we follow the design mentioned ^^, then it's not an issue since
> the host will auto-approve itself when it connects
>
>>
>> - Other issues
>>
>> * Local storage; so far we were able to define a local
>> storage in ovirt node. Stateless will block this ability.
>
> Yes, this would be unavailable if you're running stateless.
> I think that's a fine tradeoff since people want the host to be diskless.
>>
>> * Node upgrade; currently it's possible to upgrade a node
>> from the engine. In stateless it will error, since there is
>> nowhere to download the ISO file to.
>
> Upgrade is handled easily by rebooting the host after updating the pxe
> server
>
>>
>> * Collecting information; core dumps and logging may not
>> be available due to lack of space? Or will it cause kernel
>> panic if all space is consumed?
>
> A valid concern, but a stateless environment would likely have
> collectd/rsyslog/netconsole servers running elsewhere that will collect
> the logs. kdumps can be configured to dump remotely as well.

This will also need some work on the vdsm side.

>>
>
> Another concern raised is swap and overcommit. First version would
> likely disable swap completely. This would disable overcommit as well.
> Future versions could enable a local disk to be used completely for
> swap, but that is another tradeoff that people would need to evaluate
> when choosing between stateless and stateful installs.

Indeed so, I completely forgot about swap...

>
> Mike
>

-- 
/d

“Funny,” he intoned funereally, “how just when you think life can't
possibly get any worse it suddenly does.”
--Douglas Adams, The Hitchhiker's Guide to the Galaxy
