[ https://issues.apache.org/jira/browse/MAPREDUCE-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167698#comment-13167698 ]
Steve Loughran commented on MAPREDUCE-3502:
-------------------------------------------

Improved patch
# javadoc of {{Service}} interface and {{AbstractService}} so as to more clearly specify the desired behaviour of implementations.
# common methods in {{AbstractService}} to interrupt a non-null thread, and to stop non-null IPC and webapp servers. These calls return null values to overwrite field variables after the successful operation.
# static methods to stop a service if non-null, and one to do it with exceptions caught and logged.

All subclasses have been reviewed and their stop operations
# use the new common interrupt and stop methods where appropriate
# use checks for null fields before trying any stop operation
# nullify all non-final fields after their work.

Some, but not all, of the {{stop()}} methods join the thread that was just interrupted, to wait for it to stop. This would be the cleanest option, as it guarantees the worker threads will not invoke a (now-stopped) service. I have not changed the behaviour of any existing services that do not perform {{Thread.join()}} operations. If there was a well-defined behaviour for that join (time to wait, exception to throw on failure), it could be injected into the new {{AbstractService.interruptThread()}} method, and used throughout the services. Having a consistent interrupt/join/report failure process would imply only one thing to test, and more trust that new services will follow the examples set in the MRv2 codebase.

This patch does nothing to address the issue of MAPREDUCE-3535, namely that the stoppability of a service is not checked before child classes terminate; that issue is somewhat orthogonal. This patch is designed to ensure that no matter when {{Service.stop()}} is called (except in the special situation of a re-entrant call on a separate thread), everything is shut down cleanly.
This means that even if a failure is triggered halfway through the {{init(Conf)}} or {{start()}} operations, the {{stop()}} operation will always clean up what there is.

> Review all Service.stop() operations and make sure that they work before a service is started
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3502
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3502
>             Project: Hadoop Map/Reduce
>          Issue Type: Task
>          Components: mrv2
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: MAPREDUCE-3502.patch, MAPREDUCE-3502.patch
>
>   Original Estimate: 24h
>          Time Spent: 0.5h
>  Remaining Estimate: 23.5h
>
> MAPREDUCE-3431 has shown that some of the key services' shutdown operations are not robust against being invoked before the service is started. They need to be made robust by
> # not calling other things if the other things are null
> # not being re-entrant (i.e. make synchronized if possible),
> Maybe
> # have a StopService operation that only stops a service if it is live
> # factor out the is-running test from the base service class and make it a pre-check for all the child services, so they bail out sooner rather than later. This would be the best as it would be the one guaranteed to work consistently across all instances, so only one or two would need testing
> My first iteration will skip the sync, though it's something to consider.
> Testing: try to create each instance; call stop() straight after construction.
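
For illustration only, the stop-before-start pattern described in this issue (null-checked fields, a helper that returns null so the caller can overwrite its field, and a check that calls stop() straight after construction) might look roughly like the sketch below. Everything named here is a hypothetical stand-in: {{SketchService}}, the {{interruptThread(Thread, long)}} signature, the join timeout and {{stopQuietly}} are assumptions made for the example, not the code in the attached patch and not the actual {{AbstractService}} API.

```java
/** Hypothetical stand-in for a service; NOT the Hadoop Service/AbstractService API. */
class SketchService {
    private Thread worker;            // non-final field: null until start() runs
    private volatile boolean stopped;

    /**
     * Interrupts the thread if it is non-null, optionally waits for it to exit,
     * and always returns null so the caller can overwrite its field.
     * The join timeout is an assumption; the issue leaves that policy open.
     */
    static Thread interruptThread(Thread t, long joinMillis) {
        if (t != null) {
            t.interrupt();
            if (joinMillis > 0) {
                try {
                    t.join(joinMillis);
                } catch (InterruptedException e) {
                    // Preserve the caller's interrupt status instead of swallowing it.
                    Thread.currentThread().interrupt();
                }
            }
        }
        return null;
    }

    /** Stops a service if it is non-null, catching and logging runtime failures. */
    static void stopQuietly(SketchService service) {
        if (service == null) {
            return;
        }
        try {
            service.stop();
        } catch (RuntimeException e) {
            System.err.println("Exception while stopping service: " + e);
        }
    }

    synchronized void start() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(1000);   // placeholder for real work
                }
            } catch (InterruptedException e) {
                // Interrupted by stop(): fall through and let the thread exit.
            }
        }, "sketch-worker");
        worker.setDaemon(true);
        worker.start();
    }

    /** Safe before start(), after start(), and when called repeatedly. */
    synchronized void stop() {
        // The worker never calls back into this object, so joining while
        // holding the lock cannot deadlock in this sketch.
        worker = interruptThread(worker, 1000);
        stopped = true;
    }

    synchronized boolean hasWorker() {
        return worker != null;
    }

    boolean isStopped() {
        return stopped;
    }
}

/** The "call stop() straight after construction" check from the issue description. */
class StopBeforeStartCheck {
    public static void main(String[] args) {
        SketchService neverStarted = new SketchService();
        neverStarted.stop();                  // must not throw
        if (!neverStarted.isStopped()) {
            throw new IllegalStateException("stop() before start() did not complete");
        }

        SketchService started = new SketchService();
        started.start();
        started.stop();
        started.stop();                       // a repeated stop is also harmless
        if (started.hasWorker()) {
            throw new IllegalStateException("worker field was not nullified");
        }
        System.out.println("stop-before-start checks passed");
    }
}
```

Having the helper return null keeps each {{stop()}} implementation to a single assignment per field, which is the property that makes a half-completed {{start()}} safe to clean up in this sketch.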