[HelenOS-devel] Starving IPC in the async framework

Vojtech Horky Mon, 09 Apr 2012 13:16:00 -0700

Hello,
I think I encountered a little problem within the async framework and
I would be glad if someone better versed with the framework could look
at the symptoms and at the proposed solution.


The problem can be reproduced in the following scenario. The program
is single-threaded and single-fibril (i.e. no fibrils are explicitly
started in the program) and does some intensive computation. However,
we want to be able to pause the computation, so the program
periodically tests whether a key was pressed. The code can look like
the loop below; starve_demo.patch adds a `starve' module to tester
that does this.

while (...) {
        /* Do some work */

        if (console_get_kbd_event_timeout(console, &ev, 0)) {
                break;
        }
}

But if you run the test, pressing a key has no effect. The key event
is queued, though - you can verify this by running
ipc tester_task from the kernel console after you start the test and
press a key. Bottom line - this is not a problem with the console but
solely with the application.

I did some investigating and I think I found a problematic spot in the
framework. Patch starve_possible_fix.patch is my attempt at solving
it. For the sake of clarity, I will describe what made me create this
fix.

The problem is apparently around async_wait_timeout(), because it is
clear that the IPC call was sent and the reply was already received.
This function tests whether the answer has already arrived. If the
answer is there, it returns immediately. Otherwise, it creates a
timeout event and the fibril is suspended. So far, so good.

The timeout events are processed by a manager fibril in
async_manager_worker(), and I think that is the cause of the trouble.
The function first checks whether there are any expired timeouts,
handles them, and then switches to another ready fibril. This means
that a zero timeout is processed here (including the one from the key
press test) and async_wait_timeout() returns.

BUT, when an expired timeout is handled, the function does not check
for incoming IPC calls at all. Thus, the answer keeps waiting in the
kernel. If the program is single-threaded and does not do any other
IPC, the answer is never retrieved. In other words, as long as there
is an expired timeout, no IPC gets through.
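
The starvation described above can be illustrated with a small toy
model in C. This is only a sketch under stated assumptions: it is not
HelenOS code, every name in it (toy_starving_task_t,
toy_starving_manager_pass, toy_starving_poll_loop) is hypothetical, and
it reduces the manager fibril to "handle an expired timeout and
return without looking at the kernel queue".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are HelenOS APIs. */
typedef struct {
        int kernel_queue;      /* answers still waiting in the kernel */
        bool answer_received;  /* set once userspace picked the answer up */
} toy_starving_task_t;

/*
 * One pass of the modelled manager. If an expired timeout was handled,
 * the pass ends and the waiting fibril runs again; the kernel queue is
 * only looked at when no timeout had expired.
 */
static void toy_starving_manager_pass(toy_starving_task_t *t,
    bool timeout_expired)
{
        if (timeout_expired) {
                /* Wake the waiting fibril; incoming IPC is not checked. */
                return;
        }

        if (t->kernel_queue > 0) {
                t->kernel_queue--;
                t->answer_received = true;
        }
}

/*
 * Modelled polling loop: one answer already sits in the kernel and the
 * program polls with a zero timeout, which has always expired by the
 * time the manager runs. Returns the iteration in which the answer was
 * seen, or -1 if it was never seen.
 */
int toy_starving_poll_loop(int iterations)
{
        assert(iterations >= 0);

        toy_starving_task_t t = { .kernel_queue = 1, .answer_received = false };

        for (int i = 0; i < iterations; i++) {
                if (t.answer_received)
                        return i;
                toy_starving_manager_pass(&t, true);
        }

        return -1;
}
```

In this model toy_starving_poll_loop() returns -1 for any number of
iterations, because every manager pass finds an expired timeout and
never reaches the kernel queue.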

Thus, the idea is to check for waiting IPC even if expired timeouts
were processed. My biggest concern is whether my fix might break
something else. It seems to me that the current implementation is
trying to be fair: once something was done (expired timeouts were
handled), the manager fibril switches to a normal fibril to give
other fibrils a chance to run.
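
The idea can be sketched in a small toy model in C. This is not the
content of starve_possible_fix.patch and not HelenOS code: all names
(toy_fixed_task_t, toy_fixed_manager_pass, toy_fixed_poll_loop) are
hypothetical, and the model only assumes that the manager does a
non-blocking check of the kernel queue even after it has handled an
expired timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are HelenOS APIs. */
typedef struct {
        int kernel_queue;      /* answers still waiting in the kernel */
        int timeouts_handled;  /* expired timeouts processed so far */
        bool answer_received;  /* set once userspace picked the answer up */
} toy_fixed_task_t;

/*
 * One pass of the modelled manager with the proposed change: expired
 * timeouts are still handled first, but the pass then also does a
 * non-blocking check of the kernel queue instead of returning early.
 */
static void toy_fixed_manager_pass(toy_fixed_task_t *t, bool timeout_expired)
{
        if (timeout_expired)
                t->timeouts_handled++;  /* wake the waiting fibril */

        if (t->kernel_queue > 0) {
                t->kernel_queue--;
                t->answer_received = true;
        }
}

/*
 * Modelled polling loop: one answer already sits in the kernel and the
 * program polls with a zero timeout. Returns the iteration in which the
 * answer was seen, or -1 if it was never seen.
 */
int toy_fixed_poll_loop(int iterations)
{
        assert(iterations >= 0);

        toy_fixed_task_t t = {
                .kernel_queue = 1,
                .timeouts_handled = 0,
                .answer_received = false
        };

        for (int i = 0; i < iterations; i++) {
                if (t.answer_received)
                        return i;
                toy_fixed_manager_pass(&t, true);
        }

        return -1;
}
```

Here the first poll still times out, but the same manager pass picks
the answer up, so the second poll sees it. The model says nothing
about fairness towards other fibrils, which is the open question.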

I would be glad for any observations on this matter.

Thanks.

- Vojta

starve_demo.patch
Description: Binary data

starve_possible_fix.patch
Description: Binary data

_______________________________________________
HelenOS-devel mailing list
[email protected]
http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
