We've seeing issues where IO load appears to cause strange failures due to 
timeouts
within qemu. One theory for these is that it is is hitting hard page faults
at in-opportune moments which cause timing problems within the VM.

This patch is a bit of a hack which tries to ensure the data is paged in
at a point when we know we can take the time delays (waiting for the QMP
start signal). Whilst this isn't ideal, it does seem to improve things on
the autobuilder and shouldn't harm anything.

The code figures out which files to read my looking at the mmap'd files
the process has open from /proc. On Centos7 systems these files are not
user readable, if that is the case we just skip them.

Signed-off-by: Richard Purdie <[email protected]>
---
 meta/lib/oeqa/utils/qemurunner.py | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/meta/lib/oeqa/utils/qemurunner.py 
b/meta/lib/oeqa/utils/qemurunner.py
index 0032f6ed8dd..24af2fb20bb 100644
--- a/meta/lib/oeqa/utils/qemurunner.py
+++ b/meta/lib/oeqa/utils/qemurunner.py
@@ -342,7 +342,24 @@ class QemuRunner:
         finally:
             os.chdir(origpath)
 
-        # Release the qemu porcess to continue running
+        # We worry that mmap'd libraries may cause page faults which hang the 
qemu VM for periods
+        # causing failures. Before we "start" qemu, read through it's mapped 
files to try and 
+        # ensure we don't hit page faults later
+        mapdir = "/proc/" + str(self.qemupid) + "/map_files/"
+        try:
+            for f in os.listdir(mapdir):
+                linktarget = os.readlink(os.path.join(mapdir, f))
+                if not linktarget.startswith("/") or 
linktarget.startswith("/dev") or "deleted" in linktarget:
+                    continue
+                with open(linktarget, "rb") as readf:
+                    data = True
+                    while data:
+                        data = readf.read(4096)
+        # Centos7 doesn't allow us to read /map_files/
+        except PermissionError:
+            pass
+
+        # Release the qemu process to continue running
         self.run_monitor('cont')
 
         # We are alive: qemu is running
-- 
2.30.2

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#152833): 
https://lists.openembedded.org/g/openembedded-core/message/152833
Mute This Topic: https://lists.openembedded.org/mt/83456779/21656
Group Owner: [email protected]
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to