Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
On March 20, 2015, 3:52 p.m., Joe Smith wrote:

It seems like the `self.quitquitquit` call is the important part (on line 340 of the runner). Doesn't decreasing the timeout leave `quitquitquit` without the time it needs?

- Joe

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review77296
---

On March 18, 2015, 6:20 p.m., Brian Wickman wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/
---

(Updated March 18, 2015, 6:20 p.m.)

Review request for Aurora and Bill Farner.

Bugs: AURORA-1054
https://issues.apache.org/jira/browse/AURORA-1054

Repository: aurora

Description
---
Remove excessively low timeout in SIGTERM swallowing test.

Diffs
-
src/test/python/apache/aurora/executor/test_thermos_task_runner.py 6b24bbb2ab7ca16f97961aabeed945b61e5b5908

Diff: https://reviews.apache.org/r/32221/diff/

Testing
---
Cannot reproduce locally, but 5 seconds is an impossibly small timeout, even if we aren't testing SIGTERM swallowing. If this fails, we will get tripped by 60s timeout instead.

Thanks,
Brian Wickman
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
On March 20, 2015, 3:52 p.m., Joe Smith wrote:

It seems like the `self.quitquitquit` call is the important part (on line 340 of the runner). Doesn't decreasing the timeout leave `quitquitquit` without the time it needs?

Joe Smith wrote:

In `src/main/python/apache/aurora/executor/thermos_task_runner.py`

```
331    waited = Amount(0, Time.SECONDS)
332    while self.is_alive and waited < timeout:
333      self._clock.sleep(self.POLL_INTERVAL.as_(Time.SECONDS))
334      waited += self.POLL_INTERVAL
335
336    if not self.is_alive and self.task_state() != TaskState.ACTIVE:
337      return
338
339    log.info('Thermos task did not shut down cleanly, rebinding to kill.')
340    self.quitquitquit()
341
342    while not self._monitor.finished and waited < timeout:
343      self._clock.sleep(self.POLL_INTERVAL.as_(Time.SECONDS))
344      waited += self.POLL_INTERVAL
```

Is it that we need to reset `waited` to `Amount(0, Time.SECONDS)`?

- Joe

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review77296
---
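Joe's question about resetting `waited` can be illustrated with a small standalone sketch. It is not the Aurora runner: `FakeClock`, `stop_sketch`, `exits_after_kill`, and `reset_waited` are hypothetical names, and the sketch assumes the task ignores the graceful signal for the entire timeout. Under those assumptions, sharing one `waited` counter between the two loops leaves the second loop with no budget at all.

```python
class FakeClock:
    """Hypothetical stand-in for the runner's clock: sleeping only advances a counter."""

    def __init__(self):
        self.slept = 0.0

    def sleep(self, seconds):
        self.slept += seconds


def stop_sketch(clock, timeout, poll_interval, exits_after_kill, reset_waited):
    """Return the seconds spent waiting after the forced kill is issued.

    Assumption: the task swallows the graceful signal, so the first loop
    always runs until the timeout is used up.
    """
    waited = 0.0

    # Phase 1: wait for a graceful shutdown that never happens.
    while waited < timeout:
        clock.sleep(poll_interval)
        waited += poll_interval

    # The forced kill (quitquitquit in the excerpt) would be issued here.
    if reset_waited:
        waited = 0.0  # give the second loop its own budget

    # Phase 2: wait for the task to finish after the forced kill.
    second_phase = 0.0
    while second_phase < exits_after_kill and waited < timeout:
        clock.sleep(poll_interval)
        waited += poll_interval
        second_phase += poll_interval
    return second_phase


shared = stop_sketch(FakeClock(), 5.0, 0.5, 1.0, reset_waited=False)
fresh = stop_sketch(FakeClock(), 5.0, 0.5, 1.0, reset_waited=True)
print("shared counter:", shared, "reset counter:", fresh)
```

With a shared counter the second loop exits immediately, so the caller never observes the task finishing; with a reset it waits for the (assumed) one second the task needs. Whether the real runner should reset the counter or use a separate budget is exactly the open question in the review.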
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review77296
---

src/test/python/apache/aurora/executor/test_thermos_task_runner.py
https://reviews.apache.org/r/32221/#comment125192

Maybe also decrease this (line 366)?

```
poll_interval=Amount(500, Time.MILLISECONDS),
```

- Joe Smith
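As a rough illustration of why the poll interval matters next to a short timeout: a `while waited < timeout` loop that sleeps one interval per iteration can only check the task a fixed number of times. The sketch below uses the 5-second figure from the Testing note and the 500 ms interval quoted above; pairing those two values, the 100 ms alternative, and the helper name `max_polls` are assumptions made for illustration only.

```python
def max_polls(timeout_ms, poll_interval_ms):
    """Number of sleep/poll cycles a `while waited < timeout` loop can run."""
    # Ceiling division: a final partial interval still costs one full sleep.
    return -(-timeout_ms // poll_interval_ms)


for interval_ms in (500, 100):
    print(interval_ms, "ms interval:", max_polls(5000, interval_ms), "polls in 5 s")
```

A smaller interval gives the loop more chances to notice that the task has exited before the timeout expires, at the cost of more frequent wake-ups.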
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
On March 18, 2015, 11:01 p.m., Aurora ReviewBot wrote:

Master (6396410) is red with this patch.
  ./build-support/jenkins/build.sh

src.test.python.apache.aurora.client.cli.plugins ..... SUCCESS
src.test.python.apache.aurora.client.cli.quota ..... SUCCESS
src.test.python.apache.aurora.client.cli.sla ..... SUCCESS
src.test.python.apache.aurora.client.cli.supdate ..... SUCCESS
src.test.python.apache.aurora.client.cli.task ..... SUCCESS
src.test.python.apache.aurora.client.cli.update ..... SUCCESS
src.test.python.apache.aurora.client.cli.version ..... SUCCESS
src.test.python.apache.aurora.client.config ..... SUCCESS
src.test.python.apache.aurora.client.factory ..... SUCCESS
src.test.python.apache.aurora.client.hooks.hooked_api ..... SUCCESS
src.test.python.apache.aurora.client.hooks.non_hooked_api ..... SUCCESS
src.test.python.apache.aurora.common.test_aurora_job_key ..... SUCCESS
src.test.python.apache.aurora.common.test_cluster ..... SUCCESS
src.test.python.apache.aurora.common.test_cluster_option ..... SUCCESS
src.test.python.apache.aurora.common.test_clusters ..... SUCCESS
src.test.python.apache.aurora.common.test_http_signaler ..... SUCCESS
src.test.python.apache.aurora.common.test_pex_version ..... SUCCESS
src.test.python.apache.aurora.common.test_shellify ..... SUCCESS
src.test.python.apache.aurora.common.test_transport ..... SUCCESS
src.test.python.apache.aurora.config.test_base ..... SUCCESS
src.test.python.apache.aurora.config.test_constraint_parsing ..... SUCCESS
src.test.python.apache.aurora.config.test_loader ..... SUCCESS
src.test.python.apache.aurora.config.test_thrift ..... SUCCESS
src.test.python.apache.aurora.executor.common.path_detector ..... SUCCESS
src.test.python.apache.aurora.executor.common.task_info ..... SUCCESS
src.test.python.apache.aurora.executor.executor_base ..... SUCCESS
src.test.python.apache.aurora.executor.executor_vars ..... SUCCESS
src.test.python.apache.aurora.executor.status_manager ..... SUCCESS
src.test.python.apache.aurora.executor.thermos_task_runner ..... FAILURE
src.test.python.apache.thermos.cli.commands.commands ..... SUCCESS
src.test.python.apache.thermos.cli.common ..... SUCCESS
src.test.python.apache.thermos.cli.main ..... SUCCESS
src.test.python.apache.thermos.common.test_pathspec ..... SUCCESS
src.test.python.apache.thermos.core.test_runner_integration ..... SUCCESS
src.test.python.apache.thermos.monitoring.test_disk ..... SUCCESS

FAILURE
FAILURE

I will refresh this build result if you post a review containing @ReviewBot retry

Brian Wickman wrote:

welp my kingdom for reviewbot to print out stderr logs

- Brian

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review76965
---

On March 18, 2015, 10:44 p.m., Brian Wickman wrote:
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review76992
---

@ReviewBot retry

- Brian Wickman

On March 19, 2015, 1:20 a.m., Brian Wickman wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/
---

(Updated March 19, 2015, 1:20 a.m.)

Review request for Aurora and Bill Farner.

Bugs: AURORA-1054
https://issues.apache.org/jira/browse/AURORA-1054

Repository: aurora

Description
---
Remove excessively low timeout in SIGTERM swallowing test.

Diffs
-
src/test/python/apache/aurora/executor/test_thermos_task_runner.py 6b24bbb2ab7ca16f97961aabeed945b61e5b5908

Diff: https://reviews.apache.org/r/32221/diff/

Testing
---
Cannot reproduce locally, but 5 seconds is an impossibly small timeout, even if we aren't testing SIGTERM swallowing. If this fails, we will get tripped by 60s timeout instead.

Thanks,
Brian Wickman
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review76991
---

Ship it!

Master (6396410) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing @ReviewBot retry

- Aurora ReviewBot
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review76999
---

Ship it!

Master (6396410) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing @ReviewBot retry

- Aurora ReviewBot
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/
---

(Updated March 19, 2015, 1:20 a.m.)

Review request for Aurora and Bill Farner.

Changes
---
The low timeout is actually what allows the test to even work in the first place. It should've worked in the past, but at least this way we can possibly eke out the original reason it was failing.

Bugs: AURORA-1054
https://issues.apache.org/jira/browse/AURORA-1054

Repository: aurora

Description
---
Remove excessively low timeout in SIGTERM swallowing test.

Diffs (updated)
-
src/test/python/apache/aurora/executor/test_thermos_task_runner.py 6b24bbb2ab7ca16f97961aabeed945b61e5b5908

Diff: https://reviews.apache.org/r/32221/diff/

Testing
---
Cannot reproduce locally, but 5 seconds is an impossibly small timeout, even if we aren't testing SIGTERM swallowing. If this fails, we will get tripped by 60s timeout instead.

Thanks,
Brian Wickman
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review76965
---

Master (6396410) is red with this patch.
  ./build-support/jenkins/build.sh

src.test.python.apache.aurora.client.cli.plugins ..... SUCCESS
src.test.python.apache.aurora.client.cli.quota ..... SUCCESS
src.test.python.apache.aurora.client.cli.sla ..... SUCCESS
src.test.python.apache.aurora.client.cli.supdate ..... SUCCESS
src.test.python.apache.aurora.client.cli.task ..... SUCCESS
src.test.python.apache.aurora.client.cli.update ..... SUCCESS
src.test.python.apache.aurora.client.cli.version ..... SUCCESS
src.test.python.apache.aurora.client.config ..... SUCCESS
src.test.python.apache.aurora.client.factory ..... SUCCESS
src.test.python.apache.aurora.client.hooks.hooked_api ..... SUCCESS
src.test.python.apache.aurora.client.hooks.non_hooked_api ..... SUCCESS
src.test.python.apache.aurora.common.test_aurora_job_key ..... SUCCESS
src.test.python.apache.aurora.common.test_cluster ..... SUCCESS
src.test.python.apache.aurora.common.test_cluster_option ..... SUCCESS
src.test.python.apache.aurora.common.test_clusters ..... SUCCESS
src.test.python.apache.aurora.common.test_http_signaler ..... SUCCESS
src.test.python.apache.aurora.common.test_pex_version ..... SUCCESS
src.test.python.apache.aurora.common.test_shellify ..... SUCCESS
src.test.python.apache.aurora.common.test_transport ..... SUCCESS
src.test.python.apache.aurora.config.test_base ..... SUCCESS
src.test.python.apache.aurora.config.test_constraint_parsing ..... SUCCESS
src.test.python.apache.aurora.config.test_loader ..... SUCCESS
src.test.python.apache.aurora.config.test_thrift ..... SUCCESS
src.test.python.apache.aurora.executor.common.path_detector ..... SUCCESS
src.test.python.apache.aurora.executor.common.task_info ..... SUCCESS
src.test.python.apache.aurora.executor.executor_base ..... SUCCESS
src.test.python.apache.aurora.executor.executor_vars ..... SUCCESS
src.test.python.apache.aurora.executor.status_manager ..... SUCCESS
src.test.python.apache.aurora.executor.thermos_task_runner ..... FAILURE
src.test.python.apache.thermos.cli.commands.commands ..... SUCCESS
src.test.python.apache.thermos.cli.common ..... SUCCESS
src.test.python.apache.thermos.cli.main ..... SUCCESS
src.test.python.apache.thermos.common.test_pathspec ..... SUCCESS
src.test.python.apache.thermos.core.test_runner_integration ..... SUCCESS
src.test.python.apache.thermos.monitoring.test_disk ..... SUCCESS

FAILURE
FAILURE

I will refresh this build result if you post a review containing @ReviewBot retry

- Aurora ReviewBot

On March 18, 2015, 10:44 p.m., Brian Wickman wrote:

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/
---

(Updated March 18, 2015, 10:44 p.m.)

Review request for Aurora and Bill Farner.

Bugs: AURORA-1054
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review76998
---

@ReviewBot retry

- Brian Wickman
Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/
---

Review request for Aurora and Bill Farner.

Bugs: AURORA-1054
https://issues.apache.org/jira/browse/AURORA-1054

Repository: aurora

Description
---
Remove excessively low timeout in SIGTERM swallowing test.

Diffs
-
src/test/python/apache/aurora/executor/test_thermos_task_runner.py 6b24bbb2ab7ca16f97961aabeed945b61e5b5908

Diff: https://reviews.apache.org/r/32221/diff/

Testing
---
Cannot reproduce locally, but 5 seconds is an impossibly small timeout, even if we aren't testing SIGTERM swallowing. If this fails, we will get tripped by 60s timeout instead.

Thanks,
Brian Wickman
Re: Review Request 32221: Remove excessively low timeout in SIGTERM swallowing test.
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32221/#review76997
---

Ship it!

Master (6396410) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing @ReviewBot retry

- Aurora ReviewBot