Hi Jan,


> Hi Sarah,
>
> when I was doing the initial investigation, it seemed to me that
> when useradd fails, the system is left in an inconsistent
> state because of steps that happened before useradd was invoked.
> This is why I thought we would need to clean up
> the target before the installation is restarted after a useradd failure.
>
> But then I found out that the target instantiation (TI) and transfer (TM)
> phases are actually started after useradd finishes and that the
> inconsistency was caused by the fact that the orchestrator proceeded
> with the installation even if useradd failed.
>
> It means that if the orchestrator returns with failure immediately
> when it finds out that useradd wasn't successful and starts
> neither TI nor TM, no actual changes are made to the target, so no cleanup
> needs to be done and the install can be restarted successfully.
>
> So this fix doesn't address the case when the target is in an inconsistent
> state before the installer is invoked - for example, if it failed
> for some reason during the TI or TM phase and thus the target was
> already instantiated and some bits already transferred.
>
> There is still the possibility (not addressed by this bug) that
> for some reason the installer crashes during TI or TM and then
> the system might be left in a state which would prevent the user
> from restarting the installer successfully - the question might
> be whether there are valid scenarios where restarting the installer
> would help to solve the underlying issue in these cases.
> Do you think that these possibilities should be
> investigated further?
>   
I ran into these cases myself while debugging and fixing bug 533. 
Basically, if an install fails in the middle for some reason, even user 
error, the user cannot restart the installer unless they clean up the 
leftover targets. That seems broken to me. The fix you put in for this bug 
and the fix I will put in for 533 will stop the orchestrator from 
continuing in the event of failures prior to starting transfer.
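
To illustrate the fail-fast behavior described above, here is a minimal 
sketch. It is not the orchestrator code; the function name `run_install` 
and the step representation are assumptions made up for this example. It 
only shows that when an early step such as useradd fails, the later steps 
(TI, TM) are never started, so the target is left untouched.

```python
from typing import Callable, List, Tuple

# Hypothetical step type: a name plus a callable returning True on success.
Step = Tuple[str, Callable[[], bool]]


def run_install(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order and stop at the first failure.

    Returns (success, names of the steps that were actually started).
    """
    started: List[str] = []
    for name, action in steps:
        started.append(name)
        if not action():
            # Fail fast: later steps (e.g. TI, TM) never touch the target.
            return False, started
    return True, started


if __name__ == "__main__":
    steps = [
        ("useradd", lambda: False),  # simulate useradd failing
        ("TI", lambda: True),
        ("TM", lambda: True),
    ]
    ok, started = run_install(steps)
    print(ok, started)
```

Because only "useradd" appears in the list of started steps, nothing needs 
to be cleaned up before the installer is restarted in this scenario.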

My concern about not being able to restart the installer is that, in 
the livecd environment in particular, we don't disable the installer 
icon or program after a failed installation attempt. We tell users that 
TI failed, but we don't tell them how to fix it, other than that they can reboot.

We have a few choices with this (IMO):

1. Generate better error messages regarding TI failure, and how to clean 
up the leftover configuration.
2. Disable the installer icon until the user cleans up the stuff causing 
the failures. This is likely harder to do than it might seem, since we 
have no obvious way to track what they have done to clean things up.
3. Modify TI to unconfigure any existing configuration and 
recreate the zpool/zfs datasets if doing an initial install. This means 
the installer won't fail because of leftover configuration.
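
Option 3 could look roughly like the sketch below. This is not TI code; 
the function `plan_pool_setup`, the pool name, and the device name are 
hypothetical and used only for illustration. The sketch only builds the 
`zpool destroy` / `zpool create` command lines and does not run them, and 
it assumes that wiping an existing pool is acceptable only for an initial 
install.

```python
from typing import Iterable, List


def plan_pool_setup(pool: str, device: str, existing_pools: Iterable[str],
                    initial_install: bool) -> List[List[str]]:
    """Return the zpool commands needed so that setup can be retried safely."""
    commands: List[List[str]] = []
    if pool in set(existing_pools):
        if not initial_install:
            # Assumption: only an initial install may wipe an existing pool.
            raise RuntimeError(f"pool {pool!r} already exists")
        # Remove leftovers from a previous failed attempt; destroying the
        # pool also removes the datasets it contains.
        commands.append(["zpool", "destroy", "-f", pool])
    commands.append(["zpool", "create", "-f", pool, device])
    return commands


if __name__ == "__main__":
    # Illustrative names only: a leftover "rpool" from a failed install.
    for cmd in plan_pool_setup("rpool", "c0t0d0s0", ["rpool"], True):
        print(" ".join(cmd))
```

With a plan like this, a second installation attempt starts from a clean 
pool instead of failing on the leftovers of the first one.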

I do think that we need to take a look at the robustness of both TI and 
TM so that we can handle some unexpected situations better.

The code changes for this bug are fine for putback.

thanks,
sarah
****
> Thank you very much for the review,
> Jan
>
>
> Sarah Jelinek wrote:
>   
>> Hi Jan,
>>
>> It makes sense to clean up if the useradd doesn't succeed. But how does 
>> this fix clean up the other issues noted in the bug? Or is this fix 
>> intended to do that?
>>
>> sarah
>> ***
>> jan damborsky wrote:
>>     
>>> Hi Sarah, Sundar,
>>>
>>> could I please ask you to review changes for
>>> following bug ?
>>>
>>> 88 installer goes nuts when useradd fails
>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=88
>>>
>>> Webrev is available at
>>> http://cr.opensolaris.org/~dambi/bug-88/
>>>
>>> Thank you very much,
>>> Jan
>>>
>>>
>>>
>>>       
>> _______________________________________________
>> caiman-discuss mailing list
>> caiman-discuss at opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
>>     
