Repository: aurora
Updated Branches:
refs/heads/master 3ea0331d4 -> 4bc524614
Improve scheduling throughput via logging changes.
This patch makes two logging performance changes.
First, it reduces the cost of logging by replacing the costly class and line
patterns with the cheaper logger pattern. We lose line numbers and inner class
information for much cheaper logging.
Before
````
I1201 15:08:40.560 [AsyncProcessor-0, StateMachine$Builder:389]
SchedulerLifecycle state machine transition LEADER_AWAITING_REGISTRATION ->
ACTIVE
````
After
````
I1201 15:06:47.181 [AsyncProcessor-0, StateMachine] SchedulerLifecycle state
machine transition LEADER_AWAITING_REGISTRATION -> ACTIVE
````
Second, it reduces the verbosity of the `TaskStateMachine` logging where it logs
the work command from the transition. I don't think there is any operator value
in logging this (unlike the task state transitions) so I have lowered it to
debug.
Performance Before:
````
Benchmark (numPendingTasks)
(numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A
1000 thrpt 10 2.510 ± 0.557 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A
10000 thrpt 10 0.272 ± 0.030 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A
50000 thrpt 10 0.053 ± 0.011 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000
N/A thrpt 10 2.446 ± 0.698 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000
N/A thrpt 10 0.246 ± 0.018 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000
N/A thrpt 10 0.041 ± 0.006 ops/s
````
Performance After:
````
Benchmark (numPendingTasks)
(numTasksToDelete) Mode Cnt Score Error Units
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A
1000 thrpt 10 14.520 ± 5.696 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A
10000 thrpt 10 1.290 ± 0.361 ops/s
StateManagerBenchmarks.DeleteTasksBenchmark.run N/A
50000 thrpt 10 0.254 ± 0.097 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 1000
N/A thrpt 5 7.303 ± 5.662 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 10000
N/A thrpt 5 0.726 ± 0.624 ops/s
StateManagerBenchmarks.InsertPendingTasksBenchmark.run 50000
N/A thrpt 5 0.124 ± 0.058 ops/s
````
There is a performance improvement in the smaller case and no noticable
degredation in the larger cases. I also verified on a small cluster that the
improvements exist for the larger cases. I am unable to reduce the error bars
locally on the `InsertPendingTasksBenchmark`. I suspect the bencmark needs to be
tweaked to be more consistent.
Bugs closed: AURORA-1831
Reviewed at https://reviews.apache.org/r/54269/
Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/4bc52461
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/4bc52461
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/4bc52461
Branch: refs/heads/master
Commit: 4bc5246149f296b14dc520bedd71747fdb2578fb
Parents: 3ea0331
Author: Zameer Manji <[email protected]>
Authored: Fri Dec 2 14:13:08 2016 -0800
Committer: Zameer Manji <[email protected]>
Committed: Fri Dec 2 14:13:08 2016 -0800
----------------------------------------------------------------------
RELEASE-NOTES.md | 2 ++
.../org/apache/aurora/scheduler/state/TaskStateMachine.java | 2 +-
src/main/resources/logback.xml | 5 ++++-
3 files changed, 7 insertions(+), 2 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/aurora/blob/4bc52461/RELEASE-NOTES.md
----------------------------------------------------------------------
diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md
index 7a3d331..90c4793 100644
--- a/RELEASE-NOTES.md
+++ b/RELEASE-NOTES.md
@@ -20,6 +20,8 @@
can now rely purely on health checks rather than `watch_secs` timeout when
deciding an individial
instance update state, by setting `watch_secs` to 0. A service will remain
in `STARTING` state
util `min_consecutive_successes` consecutive health checks have passed.
+- The default logging output has been changed to remove line numbers and inner
class information in
+ exchange for faster logging.
### Deprecations and removals:
http://git-wip-us.apache.org/repos/asf/aurora/blob/4bc52461/src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java
----------------------------------------------------------------------
diff --git
a/src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java
b/src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java
index 23f256a..eb4fe7d 100644
--- a/src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java
+++ b/src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java
@@ -471,7 +471,7 @@ class TaskStateMachine {
}
private void addFollowup(SideEffect sideEffect) {
- LOG.info("Adding work command " + sideEffect + " for " + this);
+ LOG.debug("Adding work command {} for {}", sideEffect, this);
sideEffects.add(sideEffect);
}
http://git-wip-us.apache.org/repos/asf/aurora/blob/4bc52461/src/main/resources/logback.xml
----------------------------------------------------------------------
diff --git a/src/main/resources/logback.xml b/src/main/resources/logback.xml
index 84c175c..f4a8fa1 100644
--- a/src/main/resources/logback.xml
+++ b/src/main/resources/logback.xml
@@ -22,8 +22,11 @@ limitations under the License.
<appender name="STDERR" class="ch.qos.logback.core.ConsoleAppender">
<target>System.err</target>
<encoder>
+ <!-- Before changing this, consult
http://logback.qos.ch/manual/layouts.html and make sure
+ the parameter is not expensive. Once changed, run benchmarks to ensure
that throughput is
+ the same !-->
<pattern>
- %.-1level%date{MMdd HH:mm:ss.SSS} [%thread, %class{0}:%line] %message
%xThrowable%n
+ %.-1level%date{MMdd HH:mm:ss.SSS} [%thread, %logger{0}] %message
%xThrowable%n
</pattern>
</encoder>
</appender>