OK, after studying the problem a little more, I have a question that I cannot answer.
Why was si_netbootmond created as a separate daemon, when a better place for this monitoring would have been the monitoring source itself, si_monitor? Indeed, si_monitor learns of the "imaged" status in real time, whatever the deployment method is (bittorrent, rsyncd, flamethrower, ...).

In my previous post, I thought that looking for status 102 (rebooted) was the solution. Unfortunately, if the client is set to netboot, a reinstall loop will occur before the rebooted status has a chance to be sent to si_monitor.

So I think that a rock-solid solution would be to implement the following algorithm in si_monitor:

1/ Add an si_monitor configuration parameter to enable or disable the netbootmond feature.

2/ If it is enabled, then on receiving "imaged" (status 100), or maybe "finalizing" (101) or even "rebooting" (104), and if NET_BOOT_DEFAULT is set to local, run:
   si_mkclientnetboot --localboot --clients "<the client>"
   and optionally start a (configurable) timer.

3/ When the timer expires, check /var/lib/systemimager/clients.xml for that client. If the "rebooted" (102) status is not there, assume that the reboot failed (bad boot loader, wrong fstab, garbage from a postinstall script, ...) and revert the client to netboot so that another imaging attempt can occur.

I admit that this could lead to an endless loop of failed reimaging attempts, but we could think about a netboot entry with no timeout that waits for a keypress, a netboot entry that powers the machine off, or any other configurable "on-fail-to-reboot" behavior.

Anyway, aside from the "timer" option, the main point here is: why not integrate si_netbootmond into si_monitor? We would benefit from real-time client status without active waiting, and we would be independent of the deployment method. It is also not that difficult to code, mainly around si_monitor:390:

    if (($client->{'status'} == 100) && $netbootmond_option_enabled) {
        # List form of system() bypasses the shell, so the client name needs no quoting.
        system('si_mkclientnetboot', '--localboot', '--clients', $client->{'name'});
    }

What do you think?

Best regards,

Olivier.
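As a rough illustration of the timer logic in step 3/ above, here is a minimal Perl sketch. It is not taken from si_monitor: the settings netbootmond_enabled and reboot_timeout, and the functions on_status and check_timers, are hypothetical names invented for this example. It also assumes that si_mkclientnetboot --netboot is the right way to revert a client, and it takes the command runner as a code reference so the logic can be exercised without touching a real boot configuration.

```perl
use strict;
use warnings;

# Status codes as quoted in this thread.
use constant { IMAGED => 100, REBOOTED => 102 };

# Hypothetical settings; neither name exists in si_monitor today.
my %conf = (netbootmond_enabled => 1, reboot_timeout => 600);

my %deadline;    # client name => epoch time by which "rebooted" is expected

# Called for each status update. $run is a coderef that executes a command list.
sub on_status {
    my ($name, $status, $now, $run) = @_;
    return unless $conf{netbootmond_enabled};
    if ($status == IMAGED) {
        # Switch the client to local boot and start its reboot timer.
        $run->('si_mkclientnetboot', '--localboot', '--clients', $name);
        $deadline{$name} = $now + $conf{reboot_timeout};
    }
    elsif ($status == REBOOTED) {
        delete $deadline{$name};    # reboot confirmed, cancel the timer
    }
    return;
}

# Called periodically. Reverts every client whose timer expired without a
# "rebooted" status, and returns the names of those clients.
sub check_timers {
    my ($now, $run) = @_;
    my @failed = sort { $a cmp $b } grep { $deadline{$_} <= $now } keys %deadline;
    for my $name (@failed) {
        # Assumption: --netboot puts the client back on network boot.
        $run->('si_mkclientnetboot', '--netboot', '--clients', $name);
        delete $deadline{$name};
    }
    return @failed;
}
```

In a real daemon the runner would simply be sub { system(@_) }, on_status would be called where the client status is received, and check_timers would be called from the main loop at whatever interval suits the configured timeout.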
On Wednesday 23 July 2014 16:37:52, Olivier LAHAYE wrote:
> Dear all,
>
> I'm working on si_netbootmond (in fact, working on the OSCAR side) and
> discovered that si_netbootmond relies on the imaging method being rsync.
> It monitors the rsyncd log file for magic words stating that imaging is
> complete.
>
> There are 2 problems here:
> 1/ It doesn't work for deployments using flamethrower / multicast or
>    bittorrent.
> 2/ If a client is successfully imaged, this doesn't mean that it is able
>    to reboot (bad image, bad bootloader, ...). In this situation, setting
>    local boot is wrong, as we certainly want to boot from the net again
>    in order to re-image.
>
> I see 3 solutions:
> 1/ Replace the current code that scans /var/log/systemimager/rsyncd for
>    the magic words /scripts\/imaging_complete_?([\.0-9]+)?/ with
>    something like:
>
>    use File::Monitor;
>    my $monitor = File::Monitor->new();
>    $monitor->watch('/var/lib/systemimager/clients.xml');
>
>    then, in a loop, search for clients with status 102 (REBOOTED) and run:
>    si_mkclientnetboot --localboot --clients "<the client>"
>
> 2/ Keep the current code and add an option to monitor clients.xml instead.
>
> 3/ Drop si_netbootmond or leave it untouched, and add an option in
>    si_monitor to enable switching the netboot to local. Indeed,
>    si_monitor doesn't need to actively check a file: it blocks on a
>    socket, listening for information. If it receives the "rebooted"
>    message, it could call:
>    si_mkclientnetboot --localboot --clients "<the client>"
>
> Technically, solution 3 has the advantage of being passive, while
> solutions 1 and 2 actively poll for node status changes. The problem with
> solution 3 is that I don't know whether it's logical for si_monitor to
> have the ability to update a client's netboot configuration, and if
> si_monitor is not started, there is no way to update netboot.
>
> What is the best solution?
>
> Regards,
>
> Olivier.

Regards,

Olivier.
--
Olivier Lahaye
DRT/LIST/DIR