https://bz.apache.org/bugzilla/show_bug.cgi?id=69841

--- Comment #5 from Ákos Szőts <[email protected]> ---
I made four modifications to the patch:

1. #define LOCK_TIMEOUT 5000000
2. Left diff line 24 at "mutex busy"
3. Changed diff line 40 to "mutex busy-2"
4. Changed diff line 105 to "mutex busy-3"

This way I could differentiate where it gets stalled.

I used "strings" to verify that the deployed .so file contains all three
variants.

With a 2-second timeout I see 6 "mutex busy-2" entries in the debug log. With
a 5-second timeout this drops to 3.

Page loads slow to a crawl with these timeouts. They are fastest when the
timeout is zero. However, I believe zero being the fastest should be
considered incorrect behaviour.

My thoughts about the timeout and caching:

1. I think, theoretically, LOCK_TIMEOUT can be an arbitrarily large value
(within reason, up to roughly 5 seconds), because:
- If the initial computation or data retrieval takes /less/ time than that,
LOCK_TIMEOUT is never reached.
- If the initial computation or data retrieval takes /more/ time than that,
LOCK_TIMEOUT will be reached, but the computation still needs more time, so
LOCK_TIMEOUT won't be the limiting factor. In fact, a value that is too small
but nonzero hurts, because the thread idles first and only then starts its
own long computation. It's better to wait for the others to finish first.
- The only case where LOCK_TIMEOUT matters is an error in the /less/ case,
where the timeout guards against something like a deadlock. That should be
very rare.

2. There should be two sides of caching: (I don't know if it's the case today)

- Caching of the clear text password encryption. The provided clear text
password goes through the crypt() call, which should take ~0.5 second
regardless of the current hardware: with better hardware, one migrates to a
more expensive scheme. This protects against brute force on a modern machine.

This is where the slowness currently comes from in my case. Apache is
(probably) running crypt() separately for all 70 requests.

You can try it without DBD, just paste the output of this snippet in a
.htpasswd file:

python3 -c "from passlib.hash import sha512_crypt; import getpass;
print(sha512_crypt.hash(getpass.getpass('Enter password'), rounds=1000000))"

- The other side of caching is the "compare-to" encrypted password, which
currently resides in the DB in my case (or in .htpasswd in other cases). This
retrieval is usually fast enough.
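
To make the reasoning in point 1 concrete, here is a rough model of how long a
request waits when it arrives just behind the one holding the mutex. This is
only a sketch of my argument, not how the patch works: waiter_latency() and
its two parameters are names I made up, and it assumes a waiter that times out
starts its own full computation.

```python
def waiter_latency(compute_s: float, lock_timeout_s: float) -> float:
    """Seconds until a request queued behind the lock holder has its answer.

    Hypothetical model: the holder needs compute_s seconds, and a waiter
    that times out then runs the same computation on its own.
    """
    if compute_s <= lock_timeout_s:
        # The holder finishes before the timeout; the waiter reuses its result.
        return compute_s
    # The waiter idles for the whole timeout, then computes by itself.
    return lock_timeout_s + compute_s


# Compare a fast (0.5 s) and a slow (8 s) computation for several timeouts.
for timeout in (0.0, 2.0, 5.0, 10.0):
    print(timeout, waiter_latency(0.5, timeout), waiter_latency(8.0, timeout))
```

In this model a timeout larger than the computation never adds latency, while
a small nonzero timeout only adds idle time in front of a slow computation.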


In my case there shouldn't be any "mutex busy-2" logs, because both the
crypt() operation and the DB retrieval take at most 0.5 second, far less than
5 seconds. While the other requests wait on the mutex for the very first
request to compute crypt() and query the DB, all subsequent requests should
use the cached version (where the retrieval is again < 5 seconds).
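
The behaviour I would expect can be sketched outside of Apache as follows.
This is an illustration under my own assumptions, not the module's code:
SingleFlightCache and run_burst are hypothetical names, PBKDF2 from Python's
hashlib stands in for crypt(), and the cache key is a toy (a real cache would
not be keyed on the clear text password).

```python
import hashlib
import threading

LOCK_TIMEOUT_S = 5.0  # same order as the 5 second LOCK_TIMEOUT tested above


class SingleFlightCache:
    """The first request computes the slow hash; the others wait and reuse it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}
        self._count_lock = threading.Lock()
        self.slow_calls = 0  # how many times the expensive hash actually ran

    def _slow_hash(self, password, salt):
        with self._count_lock:
            self.slow_calls += 1
        # Stand-in for crypt(): a deliberately expensive key derivation.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000).hex()

    def get(self, user, password, salt=b"demo-salt"):
        key = (user, password)  # toy key, for illustration only
        hit = self._cache.get(key)
        if hit is not None:
            return hit
        # Wait for whoever is computing; give up after LOCK_TIMEOUT_S.
        acquired = self._lock.acquire(timeout=LOCK_TIMEOUT_S)
        try:
            hit = self._cache.get(key)  # re-check: the holder may have filled it
            if hit is not None:
                return hit
            # Reached as the first request, or after a timeout ("mutex busy").
            digest = self._slow_hash(password, salt)
            self._cache[key] = digest
            return digest
        finally:
            if acquired:
                self._lock.release()


def run_burst(n_requests):
    """Fire n_requests concurrent lookups and return the slow hash call count."""
    cache = SingleFlightCache()
    threads = [
        threading.Thread(target=cache.get, args=("alice", "secret"))
        for _ in range(n_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.slow_calls


if __name__ == "__main__":
    print("slow hash calls for 70 requests:", run_burst(70))
```

As long as the hash finishes well within the timeout, the 70 requests should
trigger a single slow call and no timeout path at all.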
