Repository: aurora
Updated Branches:
  refs/heads/master c84d08146 -> 8383ed511


Speedup regular Thermos observer checkpoint refresh

Profiling indicates that a significant part of the refresh time os spend in 
`os.path.realpath`.
This was introduced in https://reviews.apache.org/r/35580/ to properly handle 
the `latest`
symlink in the Mesos folder layout.

This patch takes a slightly different approach to solve this problem based on 
`os.path.islink`.
The latter is faster as it just needs to look at a single folder rather than an 
entire path.

Testing Done:
I have tested this build on a node with 55 running tasks and 2004 finished ones.

Before this patch:

    D0320 22:20:44.887248 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.92s
    D0320 22:20:50.746316 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.93s
    D0320 22:20:56.590157 25771 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.89s

With this patch:

    D0320 22:18:53.545236 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.48s
    D0320 22:18:59.031919 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.49s
    D0320 22:19:04.512358 16250 task_observer.py:142] TaskObserver: finished 
checkpoint refresh in 0.48s

Reviewed at https://reviews.apache.org/r/66139/


Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/8383ed51
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/8383ed51
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/8383ed51

Branch: refs/heads/master
Commit: 8383ed511ce6eb7fe78a8a9cce5d08924408b7d2
Parents: c84d081
Author: Stephan Erb <[email protected]>
Authored: Thu Jun 14 13:02:48 2018 +0000
Committer: Stephan Erb <[email protected]>
Committed: Thu Jun 14 15:04:36 2018 +0200

----------------------------------------------------------------------
 .../apache/aurora/executor/common/path_detector.py   | 15 +++++++++++----
 .../aurora/executor/common/test_path_detector.py     |  7 +++----
 2 files changed, 14 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/aurora/blob/8383ed51/src/main/python/apache/aurora/executor/common/path_detector.py
----------------------------------------------------------------------
diff --git a/src/main/python/apache/aurora/executor/common/path_detector.py 
b/src/main/python/apache/aurora/executor/common/path_detector.py
index ed264d7..81f0d75 100644
--- a/src/main/python/apache/aurora/executor/common/path_detector.py
+++ b/src/main/python/apache/aurora/executor/common/path_detector.py
@@ -28,7 +28,14 @@ class MesosPathDetector(PathDetector):
     self._sandbox_path = sandbox_path
 
   def get_paths(self):
-    def iterate():
-      for scan_result in self._detector:
-        yield 
os.path.join(os.path.realpath(ExecutorDetector.path(scan_result)), 
self._sandbox_path)
-    return list(set(path for path in iterate() if os.path.exists(path)))
+    result = []
+
+    for scan_result in self._detector:
+      executor_path = ExecutorDetector.path(scan_result)
+      sandbox_path = os.path.join(executor_path, self._sandbox_path)
+
+      # We only care about the realpath of executors and not the additional 
`latest` link
+      if not os.path.islink(executor_path) and os.path.exists(sandbox_path):
+        result.append(sandbox_path)
+
+    return result

http://git-wip-us.apache.org/repos/asf/aurora/blob/8383ed51/src/test/python/apache/aurora/executor/common/test_path_detector.py
----------------------------------------------------------------------
diff --git 
a/src/test/python/apache/aurora/executor/common/test_path_detector.py 
b/src/test/python/apache/aurora/executor/common/test_path_detector.py
index 7b5ef0c..e048b42 100644
--- a/src/test/python/apache/aurora/executor/common/test_path_detector.py
+++ b/src/test/python/apache/aurora/executor/common/test_path_detector.py
@@ -41,8 +41,8 @@ def test_path_detector():
   )
 
   with mock.patch('glob.glob', return_value=(path1_symlink, path1, path2)) as 
glob:
-    with mock.patch('os.path.realpath', side_effect=(path1, path1, path2)) as 
realpath:
-      with mock.patch('os.path.exists', side_effect=(True, True, False)) as 
exists:
+    with mock.patch('os.path.islink', side_effect=(True, False, False)) as 
islink:
+      with mock.patch('os.path.exists', side_effect=(True, False)) as exists:
         mpd = MesosPathDetector(root=FAKE_ROOT, 
sandbox_path=FAKE_CHECKPOINT_DIR)
         paths = list(mpd.get_paths())
         assert len(paths) == 1
@@ -56,13 +56,12 @@ def test_path_detector():
           'run': '*'
         }
         assert glob.mock_calls == [mock.call(expected_glob_pattern)]
-        assert realpath.mock_calls == [
+        assert islink.mock_calls == [
             mock.call(os.path.join(path1_symlink)),
             mock.call(os.path.join(path1)),
             mock.call(os.path.join(path2)),
         ]
         assert exists.mock_calls == [
             mock.call(os.path.join(path1, FAKE_CHECKPOINT_DIR)),
-            mock.call(os.path.join(path1, FAKE_CHECKPOINT_DIR)),
             mock.call(os.path.join(path2, FAKE_CHECKPOINT_DIR)),
         ]

Reply via email to