Norman: Thank you for the very thoughtful explanation. I found no useful examples of cancel_blocking in core Genode, so perhaps it is safe enough to remove, at least from the documentation; or, emit a warning in the implementations.
The final sentence of Section 4.7.6 "Enslaving services" is the one that suggested my experiment (17.05 edition). Possibly I took this out of context. Overall the Foundations book is of excellent quality, and sets a standard for systems of this type.

Arranging component relationships so that client and server correspond to a natural asymmetry of trustworthiness is sometimes straightforward, but sometimes ambiguous. E.g. should one trust calls to a log service? What if the log service gets upgraded to log to a network host that falls under control of an attacker? The attacker exploits a vulnerability and owns the logger; some critical component then halts the next time it issues a logging call. Yes, you can e.g. redesign the logger as a client--I've done this, but it adds to the complexity of other components.

In some cases, RPC might not be the most natural communications solution. Is asynchronous message-passing (using only signals and shared memory) feasible in Genode? Maybe something similar to "vchan" in Xen/Qubes. Perhaps this exists?

// Steve

On 10/17/2017 04:32 AM, Norman Feske wrote:
> Hi Steve,
>
> On 12.10.2017 00:26, Steven Harp wrote:
>> The Genode book suggests that an RPC caller can protect itself from blocking
>> in a stalled server by creating a watchdog thread to monitor the process of
>> the call, and cancel it if it takes too long.
>>
>> Is there a robust/canonical example of using cancel_blocking in this way?
>
> I am afraid that the book misled you towards an outdated direction. The
> cancel-blocking mechanism was introduced very early at a time when we
> routinely designed inter-component interfaces that were blocking at the
> server side. At that time, L4 kernels did not support any means of
> asynchronous notifications, thereby luring us into this direction.
> Later, we realized this mistake and successively redesigned the
> interfaces [1] to use a combination of synchronous RPCs that immediately
> return and asynchronous notifications for blocking at the client side.
> We announced this transition in May last year [2] and finished it in May
> this year.
>
> [1] https://genode.org/documentation/release-notes/13.02#Timer interface
> turned into asynchronous mode of operation
> [2] https://genode.org/documentation/release-notes/16.05#The_great_API_renovation
>
> For modern components, the cancel-blocking mechanism is no longer used.
> We still keep it around to uphold compatibility but I hope to eventually
> remove it from the API in the not-too-distant future.
>
>> My experiment with this (Genode 17.08, x86_32) seems to work as
>> expected--but,
>> only with the OKL4 kernel!? With nova, hw, and seL4, the cancel_blocking()
>> method executes but seemingly to no effect: the thread continues to wait on
>> the (contrived) very slow RPC call, which eventually completes.
>>
>> Suggestions?
>
> When a client calls a server, it ultimately yields the flow of control
> to the server until the server replies. Because a misbehaving server may
> never reply, e.g., because of a bug, the client could get stuck at that
> point. There is no countermeasure for this situation. We found that
> potential countermeasures like IPC timeouts or the cancel-blocking
> mechanism that are intuitively tempting are bug-prone and lead to
> nondeterministic system behavior. A client unconditionally expects that
> the server replies to an RPC request. From a client's perspective, a
> server called via RPC functions is similar to a regular third-party
> library. When calling a library function, one can never be sure that the
> function will eventually return. It could get stuck in the library.
> Therefore, we devise the best practice to implement complex (bug-prone)
> software as mere clients, not servers. Please consider Section 3.2.4.
> "Client-server relationship" of the book for a succinct characterization
> of the client and server roles within Genode.
>
> The canonical example of this best practice is the window manager, which
> is a composition of the low-complexity 'wm' component (that acts as a
> server) and the potentially high-complexity (and more bug-prone)
> layouter and window decorator components. The latter two components are
> mere clients of the 'wm' server. Another good example is the way the
> (trusted) report_rom server decouples the producers and consumers of
> state information. Both the producer ('Report' session client) and
> consumer ('ROM' session client) are clients of the report_rom server.
> They both trust the report_rom server but they don't need to trust each
> other, nor does the report_rom server need to trust any of them.
>
> Note that throughout Genode, there are still several places where we
> don't fully adhere to this practice yet. E.g., NIC drivers (like the
> highly complex wifi_drv) still act as servers. But we will ultimately
> change this in a way that NIC drivers will become clients of the
> low-complexity nic_router component.
>
> When following this route, there is no need for the cancel-blocking
> mechanism. Your observation that the cancel-blocking mechanism works for
> RPCs on OKL4 is just an artifact from the past.
>
> Sorry that the book guided you in the wrong direction. Could you please
> point me to the particular part so that I can revise it?
>
> Cheers
> Norman
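
For illustration only, the pattern described in the quoted reply (an RPC that returns immediately, combined with an asynchronous notification that the client waits for on its own terms) can be sketched in plain standard C++. This is not Genode API. The names Notification_box, Slow_server, and call_with_timeout are hypothetical, and a worker thread with a condition variable merely stands in for a signal plus shared memory.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <string>
#include <thread>
#include <utility>

using Ms = std::chrono::milliseconds;

/* Hypothetical stand-in for a signal context plus a shared buffer. */
class Notification_box
{
	std::mutex                 _mutex;
	std::condition_variable    _cond;
	std::optional<std::string> _reply;

public:
	/* Server side: deposit the reply, then notify the waiting client. */
	void submit(std::string reply)
	{
		{
			std::lock_guard<std::mutex> guard(_mutex);
			_reply = std::move(reply);
		}
		_cond.notify_one();
	}

	/* Client side: block locally, but only up to 'timeout'. */
	std::optional<std::string> wait_for(Ms timeout)
	{
		std::unique_lock<std::mutex> lock(_mutex);
		_cond.wait_for(lock, timeout, [this] { return _reply.has_value(); });
		return _reply; /* empty if the deadline passed without a reply */
	}
};

/* Hypothetical server: request() returns at once, the work happens later. */
class Slow_server
{
	std::thread _worker;

public:
	/* Intended to be called once per server object in this sketch. */
	void request(Notification_box &box, Ms work_time)
	{
		_worker = std::thread([&box, work_time] {
			std::this_thread::sleep_for(work_time);
			box.submit("done");
		});
	}

	/* The join is only for a clean teardown of this demo. */
	~Slow_server() { if (_worker.joinable()) _worker.join(); }
};

/* Returns true if the notification arrived before the client's deadline. */
bool call_with_timeout(Ms work_time, Ms timeout)
{
	Notification_box box;    /* destroyed last, so it outlives the worker */
	Slow_server      server;

	server.request(box, work_time);           /* does not block */
	return box.wait_for(timeout).has_value(); /* client decides how long to wait */
}
```

The point of the sketch is that the decision to stop waiting is made entirely on the client side, so nothing has to cancel a call that is blocked inside the server. Calling call_with_timeout(Ms(10), Ms(2000)) should yield true, while call_with_timeout(Ms(500), Ms(20)) should yield false once the demo's worker thread has been joined.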