[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167698#comment-13167698
 ] 

Steve Loughran commented on MAPREDUCE-3502:
-------------------------------------------

Improved patch:
 # javadoc for the {{Service}} interface and {{AbstractService}} that more 
clearly specifies the desired behaviour of implementations.
 # common methods in {{AbstractService}} to interrupt a non-null thread and to 
stop non-null IPC and webapp servers. Each call returns null, so the caller can 
assign the result back to the field and clear it once the operation succeeds.
 # static methods to stop a service if it is non-null, including one that 
catches and logs any exceptions.
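
As a rough illustration of items 2 and 3, the helpers might look something like 
the sketch below. This is not the code from the attached patch: {{Stoppable}}, 
{{ServiceHelpers}}, {{stopIfNotNull}} and {{stopQuietly}} are hypothetical names 
chosen for the example, and the sketch returns the caught exception instead of 
logging it so that it stays self-contained. Only the idea of returning null so 
that the caller can overwrite its field comes from the description above.

```java
/** Hypothetical stand-in for the Service interface; not the Hadoop type. */
interface Stoppable {
  void stop();
}

/** Hypothetical helper class sketching null-safe interrupt and stop calls. */
final class ServiceHelpers {
  private ServiceHelpers() {}

  /**
   * Interrupt the thread if it is non-null.
   * Always returns null so the caller can write: worker = interruptThread(worker);
   */
  static Thread interruptThread(Thread thread) {
    if (thread != null) {
      thread.interrupt();
    }
    return null;
  }

  /** Stop the service only if there is one; exceptions propagate to the caller. */
  static void stopIfNotNull(Stoppable service) {
    if (service != null) {
      service.stop();
    }
  }

  /**
   * Stop the service if non-null, catching any exception it throws.
   * Returns the caught exception (a real service would log it), or null on success.
   */
  static Exception stopQuietly(Stoppable service) {
    try {
      stopIfNotNull(service);
      return null;
    } catch (Exception e) {
      return e;
    }
  }
}
```

A {{stop()}} implementation could then clear a field in one line, for example 
{{worker = ServiceHelpers.interruptThread(worker);}}, whether or not the thread 
was ever created.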

All subclasses have been reviewed, and their stop operations now:
 # use the new common interrupt and stop methods where appropriate
 # check for null fields before attempting any stop operation
 # set all non-final fields to null once they have been cleaned up.

Some, but not all, of the {{stop()}} methods join the thread that was just 
interrupted, to wait for it to stop. This would be the cleanest option, as it 
guarantees the worker threads will not invoke a (now-stopped) service. I have 
not changed the behaviour of any existing services that do not perform 
{{Thread.join()}} operations. If there were a well-defined behaviour for that 
join (time to wait, exception to throw on failure), it could be built into 
the new {{AbstractService.interruptThread()}} method and used throughout the 
services. Having one consistent interrupt/join/report-failure process would mean 
only one thing to test, and more confidence that new services will follow the 
examples set in the MRv2 codebase.
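
To make the interrupt/join/report idea concrete, here is a minimal sketch of 
what such a shared operation could look like. It is an assumption-laden 
example, not part of the patch: {{ThreadStopper}} and {{interruptAndJoin}} are 
hypothetical names, and reporting failure through a boolean return value is only 
one possible policy (throwing an exception would be another).

```java
/** Hypothetical helper; the timeout and reporting policy are assumptions. */
final class ThreadStopper {
  private ThreadStopper() {}

  /**
   * Interrupt the thread, then wait up to timeoutMillis for it to finish.
   * The timeout should be positive: Thread.join(0) means "wait forever".
   *
   * @return true if the thread is null or no longer alive,
   *         false if it was still running when the wait ended
   */
  static boolean interruptAndJoin(Thread thread, long timeoutMillis) {
    if (thread == null) {
      return true;            // nothing was ever started, so nothing to stop
    }
    thread.interrupt();
    try {
      thread.join(timeoutMillis);
    } catch (InterruptedException e) {
      // The stopping thread was itself interrupted: restore the flag
      // and fall through to report the worker's current state.
      Thread.currentThread().interrupt();
    }
    return !thread.isAlive();
  }
}
```

With something like this in one place, each service's {{stop()}} would only 
need to decide what to do when the call returns false.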

This patch does nothing to address MAPREDUCE-3535, the issue that the 
stoppability of a service is not checked before child classes terminate; that 
problem is somewhat orthogonal.

This patch is designed to ensure that no matter when {{Service.stop()}} is 
called (except in the special situation of a re-entrant call on a separate 
thread), everything is shut down cleanly. This means that even if a failure 
is triggered halfway through the {{init(Conf)}} or {{start()}} operations, the 
{{stop()}} operation will always clean up whatever has been created so far.
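
The pattern can be shown with a small sketch. {{SketchService}} is a 
hypothetical class written for this comment, not one of the MRv2 services, and 
making the lifecycle methods synchronized is an assumption of the example, not 
something the patch does. Each field is checked for null before it is touched 
and set to null afterwards, so {{stop()}} behaves the same whether it runs 
straight after construction, after {{init()}}, after {{start()}}, or twice.

```java
/** Hypothetical service, not a Hadoop class, whose stop() is safe in any state. */
class SketchService {
  private StringBuilder buffer;   // stands in for a resource created in init()
  private Thread worker;          // created in start()

  synchronized void init() {
    buffer = new StringBuilder();
  }

  synchronized void start() {
    worker = new Thread(() -> {
      try {
        Thread.sleep(Long.MAX_VALUE);   // placeholder for real work
      } catch (InterruptedException e) {
        // interrupted by stop(): exit the thread
      }
    });
    worker.setDaemon(true);
    worker.start();
  }

  synchronized void stop() {
    // Only touch what exists; clear each field once it has been handled.
    if (worker != null) {
      worker.interrupt();
      worker = null;
    }
    if (buffer != null) {
      buffer.setLength(0);
      buffer = null;
    }
  }

  /** True when stop() has nothing left to clean up. */
  synchronized boolean isCleanedUp() {
    return worker == null && buffer == null;
  }
}
```

Calling {{new SketchService().stop()}} is therefore a no-op and not a 
NullPointerException, which is the behaviour this issue asks every service to 
have.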
 
                
> Review all Service.stop() operations and make sure that they work before a 
> service is started
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3502
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3502
>             Project: Hadoop Map/Reduce
>          Issue Type: Task
>          Components: mrv2
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: MAPREDUCE-3502.patch, MAPREDUCE-3502.patch
>
>   Original Estimate: 24h
>          Time Spent: 0.5h
>  Remaining Estimate: 23.5h
>
> MAPREDUCE-3431 has shown that some of the key services' shutdown operations 
> are not robust against being invoked before the service is started. They need 
> to be made robust by
> # not calling other things if the other things are null
> # not being re-entrant (i.e. make synchronized if possible).
> Maybe 
> # have a StopService operation that only stops a service if it is live
> # factor out the is-running test from the base service class and make it a 
> pre-check for all the child services, so they bail out sooner rather than 
> later. This would be the best as it would be the one guaranteed to work 
> consistently across all instances, so only one or two would need testing.
> My first iteration will skip the sync, though it's something to consider. 
> Testing: try to create each instance; call stop() straight after 
> construction. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
