Hi Sarah,
Sarah Jelinek wrote: > Hi Jan, > > > >> Hi Sarah, >> >> when I was doing initial investigation, it seemed to me that >> at the moment useradd fails the system is left in inconsistent >> state because of steps which happened before useradd was invoked. >> This is the reason, why I thought we would need to clean up >> the target before installation is restarted after useradd failure. >> >> But then I found out that target instantiation (TI) and transfer (TM) >> phases are actually started after useradd finishes and that the >> inconsistency was caused by the fact that orchestrator proceeded >> with installation even if useradd failed. >> >> It means that if orchestrator returns with failure immediately >> when it finds out that useradd wasn't successful and doesn't start >> TI nor TM, no actual changes are done to the target, so no cleanup >> needs to be done and install can be restarted successfully. >> >> So this fix doesn't address the case when target is in inconsistent >> state before installer is invoked - for example if it would fail >> for some reason during TI or TM phase and thus target would be >> already instantiated and some bits already transfered. >> >> There is still possibility (not addressed by this bug) that >> for some reason installer crashes during TI or TM and then >> the system might be left in state which would prevent user >> from restarting installer successfully - the question might >> be if there are valid scenarios when restarting installer >> would help to solve the underlying issue in these cases. >> Do you think that these possibilities should be more >> investigated ? >> > I ran in to these cases myself while debugging and fixing bug 533. > Basically if an install fails in the middle for some reason even user > error the user cannot restart the installer unless they cleanup the left > over targets. That seems broken to me. I see - thinking more about that particular scenario I agree with you installer should be able to deal with this situation in some way. > The fix you put in for this bug > and the fix I will put in for 533 will stop the orchestrator from > continuing on in the event of failures prior to starting transfer. Agreed. > > My concern about not being able to restart the installer is that with > the livecd environment in particular, we don't disable the installer > icon or program after a failed installation attempt. We tell them that > TI failed, but we don't tell them how to fix it except they can reboot. > > We have a few choices with this(IMO): > > 1. Generate better error messages regarding TI failure, and how to clean > up the leftover configuration. > 2. Disable the installer icon until the user cleans up the stuff causing > the failures. This is likely harder to do than might seem obvious since > how are we going to be able to track what they have done to clean things up. > 3. Modify TI to recreate unconfigure any existing configuration and > recreate the zpool/zfs datasets if doing an initial install. This means > the installer won't fail due to this. I have taken a look at TI part and played for a while with the installer. After invoking it from command line, I let him finish TI and start TM. Then I interrupted the install process and tried to clean up the system, so that installer could be restarted. So far it seems to me that "zpool destroy -f rpool" could take care of cleaning up everything related to TI ZFS targets (root pool, datasets, ...), which should be sufficient for TI to allow successfully instantiate target again. I am not sure if there is anything we should be aware regarding TM part - I am CCing Moinak, who might be able to provide this information. Also, we probably would need to think about changes done by orchestrator - for example I have noticed that I can't create the same user if I restart installer - not sure if something else in the system might be affected as well. > > I do think that we need to take a look at the robustness of both TI and > TM so that we can handle some unexpected situations better. > > The code changes for this bug are fine for putback. Thank you for the review, Jan > > thanks, > sarah > **** >> Thank you very much for the review, >> Jan >> >> >> Sarah Jelinek wrote: >> >>> Hi Jan, >>> >>> This makes sense to cleanup if the useradd doesn't succeed. but how does >>> this fix cleanup the other issues noted in the bug? Or is this fix >>> intended to do that? >>> >>> sarah >>> *** >>> jan damborsky wrote: >>> >>>> Hi Sarah, Sundar, >>>> >>>> could I please ask you to review changes for >>>> following bug ? >>>> >>>> 88 installer goes nuts when useradd fails >>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=88 >>>> >>>> Webrev is available at >>>> http://cr.opensolaris.org/~dambi/bug-88/ >>>> >>>> Thank you very much, >>>> Jan >>>> >>>> >>>> >>>> >>> _______________________________________________ >>> caiman-discuss mailing list >>> caiman-discuss at opensolaris.org >>> http://mail.opensolaris.org/mailman/listinfo/caiman-discuss >>> >> _______________________________________________ >> caiman-discuss mailing list >> caiman-discuss at opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/caiman-discuss >> >> > _______________________________________________ > caiman-discuss mailing list > caiman-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
