On 12/04/2015 12:42 PM, Lukas Slebodnik wrote:
On (03/12/15 20:22), Jakub Hrozek wrote:
On Wed, Dec 02, 2015 at 05:59:07PM +0100, Lukas Slebodnik wrote:
On (02/12/15 17:10), Michal Židek wrote:
Hi!

I saw some integration tests failures recently,
and I think there is a race condition between the
enumeration refresh timeout and the sleeps
after some operations that wait for this timeout.
SSSD fails to populate changes from LDAP in time
and some asserts can fail because of this.

So far I saw 4 tests to fail like this, which
is already quite a lot.

The attached patch modifies the timeout values
and hopefully removes the issue.

Michal

>From b724db15ce0c1593cfdd7b4da8e0c39e97942e8c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20=C5=BDidek?= <mzi...@redhat.com>
Date: Wed, 2 Dec 2015 16:44:48 +0100
Subject: [PATCH] ldap_test.py: Modify enum cache timeouts

There is a race condation between ldap
enumeration refresh timeout and the sleeps
that wait for the ldap changes to populate
to SSSD if the timeout and the sleeps have
the same value.
---
src/tests/intg/ldap_test.py | 30 +++++++++++++++++-------------
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/tests/intg/ldap_test.py b/src/tests/intg/ldap_test.py
index 757ee20..8ec8dbe 100644
--- a/src/tests/intg/ldap_test.py
+++ b/src/tests/intg/ldap_test.py
@@ -33,7 +33,11 @@ import ldap_ent
from util import *

LDAP_BASE_DN = "dc=example,dc=com"
-INTERACTIVE_TIMEOUT = 4
+INTERACTIVE_TIMEOUT = 2
+
+
+def wait_for_ldap_enum_refresh():
+    time.sleep(INTERACTIVE_TIMEOUT + 4)
Why does it need to be INTERACTIVE_TIMEOUT + 4

Could it be INTERACTIVE_TIMEOUT + 3 or + 5


Regardless of the value we choose, can we move this patch forward? I see
the related failure quite often in SSSD.
Adding timeout without real explanation is not a solution.

The main problem is with empiric values.
If they are very high then test are slow.
And there still can be slow/fast machine where lower values caused troubles.

The ideal solution would be to get rid of enumeration
in ldap tests. If we want to test enumeration than it should be in separate
test. I'm not sure we would be able to get rid of "sleep()"
in enumeration test but all such values should pre properly documented why it
is big enough ...

Yeah, these timeouts are messy.

The problem is not the empiric timeout values themselves but rather the
guesswork of when certain cache state changes are supposed to happen in
relation to them. If we can reason about event deadlines then we can use them.

If not, and the code is too complicated, can we perhaps introduce some
explicit synchronization with sssd caching mechanism, where sssd behavior will
drive the tests? E.g. the tests would actually wait for sssd to do something
with the cache and after sssd reports it is done, the tests will verify the
time and the result?

We would still get to test that sssd does something we need in the expected
timeframe, but we can make the tests faster - i.e. as fast as sssd can perform
and be configured to.

Perhaps add something to the sss_cache tool?

Nick
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org

Reply via email to