Hi Karen.

LGTM.

    Thanks,
    Jack

On 06/16/11 04:28 PM, Karen Tung wrote:
Thank you Dave and Jack for the code review.

I updated the code based on our email discussions, and the webrev
is updated in place.

http://cr.opensolaris.org/~ktung/7050253/

I also redo testing to make sure the updates didn't cause any regression.

Please let me know if you have any further comments.

Thanks,

--Karen


On 06/16/11 11:21, Karen Tung wrote:
Hi Jack,

Please see my comments inline.

On 06/16/11 08:58, Jack Schwartz wrote:
Hi Dave.

When I reviewed I wondered the same thing... It appears that all modules which use the progress monitor set "done" in a "finally" clause for exception handling that envelops the transfer, so it seems safe to me. The thing I was wondering is who's responsibility is it to guard against hanging?

If it is the progress monitor itself that is responsible, then maybe a heartbeat or timeout mechanism is needed where the client module (ips or cpio) periodically hits a variable to let the monitor know the client is still alive. But maybe this is over-engineering for our purposes.

If it is the client module that is responsible, then things look fine to me the way they are.
I would think that it would be the caller's responsibility to instruct the
progress monitoring thread on when to stop.

The progress monitoring thread exits under 2 conditions:

1) the client setting the "done" variable
2) the expected amount of data to be transferred is done.  The expected
amount of data to be transferred is passed in by the client when starting
the progress monitoring thread.

In the case of this bug, because I was trying to improve
the long delay in the start of progress reporting for the whole install
when I was working on the cud_ti project,
I put in a workaround to just return a fake value for the
amount of data to be transferred.  So, the exit condition 2 can't
be used, and we are just relying on exit condition 1.

I am working on improving the overall progress reporting experience
for the installers and I will be removing the hack to return the fake value.
After everything is fixed, the amount of data to be monitored will be
real.

To make absolutely sure that we have no hang, we can put a timeout
in the ProgressMon.wait() function.  This function is called when the
client is waiting for the thread to end gracefully after it sets the "done" variable.
When we reach the time out, the ProgressMon.wait() function can return
and the client and continue on.  In the rare case that the progress
monitoring thread really didn't end for some reason after the timeout,
it will just be left around until the application ends, which I think is
better than the application hanging.

Putting the timeout in is actually very easy to do, I will add this as part of
this putback too.

Thanks,

--Karen

    Thanks,
    Jack

On 06/16/11 08:29 AM, Dave Miner wrote:
On 06/16/11 02:58 AM, Karen Tung wrote:
Hi,

I would like to get a code review for the changes to fix the following bug:

7050253 Net boot text installer hangs all the time in systems with 8 GB
or more memory.

webrev:
http://cr.opensolaris.org/~ktung/7050253/


Karen, the only question I had is whether there's any way the file system creation fails and leaves things hung up in the loop at 88-89 here?

Dave
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss


_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss

Reply via email to