-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58979/
-----------------------------------------------------------

Review request for Aurora, David McLaughlin, Stephan Erb, and Zameer Manji.


Repository: aurora


Description
-------

`LogStorage` exposes aggregated stats on its write lock wait time (`log_storage_write_lock_wait`). However, this does not help much with performance studies where a breakdown of wait time by caller is required.

This patch exposes stats on the writer processes trying to acquire the `LogStorage` write lock. Metric collection has been put behind a feature flag (`enable_log_storage_lock_wait_tracking`) to ensure it does not degrade scheduler performance (although the induced overhead should be minimal). It is disabled by default but can be enabled for performance studies or in test clusters.


Diffs
-----

  src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 387350c7667a5fb8ee674ad0d3dd17529232b25b
  src/main/java/org/apache/aurora/scheduler/storage/log/LogStorageModule.java 835f1604c0c5d913a87d570ee01d053bbbf92ecb
  src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 0eb54fdaddfbc2af76fd83ffee18ce4c6b61cc48


Diff: https://reviews.apache.org/r/58979/diff/1/


Testing
-------

- Manually under Vagrant
- End-to-end test script

```
$ curl -s localhost:8081/vars | egrep -e 'log_storage_write_lock_wait_for_.*_nanos_total '
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.BatchWorker.processBatch_BatchWorker.java_207__nanos_total 81407978
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.SchedulerLifecycle_4.accept_SchedulerLifecycle.java_226__nanos_total 2337
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.handleRegistration_MesosCallbackHandler.java_182__nanos_total 3257
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.lambda_handleOffers_3_MesosCallbackHandler.java_206__nanos_total 3628
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.Storage_MutateWork_NoResult.apply_Storage.java_152__nanos_total 2101
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.db.RowGarbageCollector.runOneIteration_RowGarbageCollector.java_83__nanos_total 79698582
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage.replay_LogStorage.java_469__nanos_total 172993
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage__EnhancerByGuice__924dd57b.CGLIB_start_6__generated___nanos_total 3323
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.systemResume_JobUpdateControllerImpl.java_302__nanos_total 1928

$ curl -s localhost:8081/vars | egrep -e 'log_storage_write_lock_wait_for_.*_events '
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.BatchWorker.processBatch_BatchWorker.java_207__events 1
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.SchedulerLifecycle_4.accept_SchedulerLifecycle.java_226__events 1
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.handleRegistration_MesosCallbackHandler.java_182__events 1
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.mesos.MesosCallbackHandler_MesosCallbackHandlerImpl.lambda_handleOffers_3_MesosCallbackHandler.java_206__events 1
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.Storage_MutateWork_NoResult.apply_Storage.java_152__events 1
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.db.RowGarbageCollector.runOneIteration_RowGarbageCollector.java_83__events 2
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage.replay_LogStorage.java_469__events 35
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.storage.log.LogStorage__EnhancerByGuice__924dd57b.CGLIB_start_6__generated___events 1
log_storage_write_lock_wait_for_org.apache.aurora.scheduler.updater.JobUpdateControllerImpl.systemResume_JobUpdateControllerImpl.java_302__events 1
```


Thanks,

Mehrdad Nurolahzade
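
For readers who want a feel for per-caller lock wait tracking without opening the diff, here is a minimal sketch of the general idea. It is not the code in this patch: `LockWaitTracker`, its constructor flag (standing in for `enable_log_storage_lock_wait_tracking`), and `snapshot()` are hypothetical names, and the rule used to turn a stack frame into a metric name is only inferred from the sample output in the Testing section. The sketch times the outermost write lock acquisition and files the wait under the first stack frame outside the tracker class.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical illustration only; not the implementation in LogStorage.java. */
final class LockWaitTracker {
  private static final String PREFIX = "log_storage_write_lock_wait_for_";

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, LongAdder> stats = new ConcurrentHashMap<>();
  private final boolean trackingEnabled; // stands in for the feature flag

  LockWaitTracker(boolean trackingEnabled) {
    this.trackingEnabled = trackingEnabled;
  }

  /** Runs work under the write lock, recording how long the caller waited for it. */
  <T> T write(Supplier<T> work) {
    // Reentrant acquisitions never block, so only the outermost one is timed.
    boolean track = trackingEnabled && !lock.isWriteLockedByCurrentThread();
    // Resolve the caller before timing so stack capture is not counted as wait.
    String caller = track ? callerName() : null;
    long start = System.nanoTime();
    lock.writeLock().lock();
    long waitedNanos = System.nanoTime() - start;
    try {
      if (track) {
        record(caller, waitedNanos);
      }
      return work.get();
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Returns a sorted copy of all counters, keyed by metric name. */
  Map<String, Long> snapshot() {
    Map<String, Long> copy = new TreeMap<>();
    stats.forEach((name, value) -> copy.put(name, value.sum()));
    return copy;
  }

  private void record(String caller, long waitedNanos) {
    stats.computeIfAbsent(PREFIX + caller + "_nanos_total", k -> new LongAdder())
        .add(waitedNanos);
    stats.computeIfAbsent(PREFIX + caller + "_events", k -> new LongAdder())
        .increment();
  }

  /** Finds the first stack frame outside this class and makes it metric-safe. */
  private static String callerName() {
    for (StackTraceElement frame : new Throwable().getStackTrace()) {
      if (!frame.getClassName().equals(LockWaitTracker.class.getName())) {
        String raw = frame.getClassName() + "." + frame.getMethodName()
            + "(" + frame.getFileName() + ":" + frame.getLineNumber() + ")";
        // Assumed rule: anything other than letters, digits, '.' and '_' becomes '_'.
        return raw.replaceAll("[^A-Za-z0-9._]", "_");
      }
    }
    return "unknown";
  }
}
```

Under the assumed sanitization rule, a frame such as `org.apache.aurora.scheduler.BatchWorker.processBatch(BatchWorker.java:207)` becomes `org.apache.aurora.scheduler.BatchWorker.processBatch_BatchWorker.java_207_`, which lines up with the metric names shown in the Testing section. Capturing a stack trace on every acquisition has a cost, which is one reason to keep this kind of tracking behind a flag.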
