Re: libc and Genode signal handlers, RPC

2017-12-20 Thread Christian Helmuth
Hello Steve,

On Sun, Dec 17, 2017 at 05:38:59PM -0600, Steven Harp wrote:
>   Release 17.02 introduced a new execution model of the C runtime
> which was, I believe, intended to make it possible to write components 
> that combine regular Genode signal handlers and RPC with code using libc.
> This would be very handy for porting network services etc that
> are greatly simplified by the C runtime.

We introduced the Libc Component and Select_handler API as a first
approach to support Genode components that use contrib implementations
from the POSIX environment. During the past months we gathered a lot
of experience and also identified some flaws in our first take. For
example, we're currently moving with_libc() from the public API into
the libc internals as it brought too much hassle. On the other hand,
we're quite certain the current execution model has proven practical.

When implementing a libc component, the design must conform to the
execution model, which in particular assigns the initial entrypoint
the role of the I/O signal handler. The following aspects may be
relevant.

- There are several libc operations that may block the calling thread.
  Examples are read, select, and nanosleep.

- The initial entrypoint must regularly pass through a code section
  where it dispatches libc-relevant I/O signals, which in turn may
  unblock blocking libc operations. These code sections are the
  entrypoint's RPC-and-signal dispatcher (which is entered on return
  from an RPC function or a signal handler) and any blocking libc
  operation. (Searching the sources for calls to
  wait_and_dispatch_one_io_signal*() may help to get a picture.)

- Note that if the initial entrypoint is blocked in a libc operation
  (e.g., read(socket)) it only handles I/O signals but does not
  respond to RPC requests or other signals. Those are dispatched in
  the entrypoint dispatcher only.

- If the initial entrypoint is blocked by other means (a blocking RPC
  to a service, a semaphore, a lock) or is busy looping, it will not
  dispatch I/O signals and thus not unblock libc operations that
  other threads in the same component may be blocked in.
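The constraints above can be summarized in a short pseudocode sketch.
This is not compilable code; the identifiers are loose assumptions
based on the names mentioned above, not the exact Genode API:

```
/* pseudocode sketch -- identifiers are assumptions, not the exact API */

construct(Libc::Env &env)          /* runs on the initial entrypoint */
{
    with_libc([&] {
        /* (1) a blocking libc operation dispatches I/O signals only,
           internally looping on wait_and_dispatch_one_io_signal() */
        read(socket_fd, buf, len);
    });

    /* (2) on return from construct(), the entrypoint enters its
       RPC-and-signal dispatcher: here, RPC requests and *all*
       signals are handled */
}

/* (3) blocking by other means (semaphore, lock, blocking RPC) or
   busy looping dispatches nothing -- libc operations of other
   threads in this component will never unblock */
```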

> Q:  Is there a preferred model for the interaction between the libc
> and non-libc parts of such a component?  For example, suppose we want
> to make the libc code block on a Genode signal (asynchronous event)?

Not yet. You may find different approaches throughout the sources or
in developers' repositories. Indeed, we already continued the OpenVPN
discussion in the issue tracker [1].

> After reading discussion [1] I found that running the libc elements
> in a new Genode thread, and blocking on a Genode::Semaphore was a workable
> option, though perhaps not very elegant. The naive approach (using the 
> semaphore without the extra thread) seems to lead to a situation where
> signals are received but never unblock the libc code.

This situation is expected given the execution model described above:
if the initial entrypoint blocks without handling I/O signals, other
threads never leave their blocking state.

> Can anyone recommend some examples/tests that illustrate the intended
> new approach for:
> (a) libc code blocking on a Genode signal,
> (b) a component's RPC service invoking some code using libc
> (c) libc code invoking Genode RPC
> (d) invoking libc code in Genode callback, e.g. timeout handler

I suggest you have a look into the libc_component test [2]. This test
served as debugging environment during development of the libc
execution model. If the test leaves some gaps, please feel free to
ask.

In addition to the descriptions above, I'd like to note that mixing
Genode and libc code bears a high risk of designing complex solutions
with unexpected corner cases. Please look into [1], where I sketched
an alternative implementation of the OpenVPN tun backend. I'm certain
that future integrations of libc and Genode (at least on the client
side) will need a twofold approach: a pseudo file system provided by
a Genode-facing VFS plugin, and an application backend using this
file system. The advantage of this approach is that both parts can
get the best out of their environment. For example, the application
backend may use select() to monitor a set of file descriptors while
the VFS plugin uses Genode APIs and signal handling. Finally, plugin
and application are coupled by the libc file API only. Currently, the
most complex pseudo file system is the socket_fs [3].


[1] https://github.com/genodelabs/genode/issues/2595
[2] https://github.com/genodelabs/genode/blob/17.11/repos/libports/src/test/libc_component/main.cc
[3] https://genode.org/documentation/release-notes/17.02#Linux_TCP_IP_stack_as_VFS_plugin

Regards
-- 
Christian Helmuth
Genode Labs

https://www.genode-labs.com/ · https://genode.org/
https://twitter.com/GenodeLabs · /ˈdʒiː.nəʊd/

Genode Labs GmbH · Amtsgericht Dresden · HRB 28424 · Sitz Dresden
Geschäftsführer: Dr.-Ing. Norman Feske, Christian Helmuth


Re: overhead of using multiple processes

2017-12-20 Thread Alexander Senier
Hi Ben,

IMO you're bringing up a very relevant and important question.

The file-type identification issue could be severe for users, as file
parsing and media rendering libraries do not have a particularly good
track record when it comes to security [1]. I guess this would be
less dramatic in your envisioned architecture if another component
enforces read-only file access and results are exported through a
report ROM.

Still, in the shared scenario we must assume the identification
component runs arbitrary attacker-controlled code once it accesses a
single malicious file. Then it could (a) attack its client through
malformed ROM content and (b) provide valid but misleading file
information, as you mention.

Case (a) could be mitigated through a trusted filter component
between the indexer and its client. Case (b), however, may trick the
user into treating one file type as another, for example, clicking on
an executable masqueraded as a harmless text file. I have no really
good scenario for that yet, but it definitely feels bad.

It feels much worse when considering general-purpose image rendering
or font rendering. In that case, none of your output handled by that
central component could be trusted anymore. An attacker could trick
you into digitally signing a contract you had no intention to sign,
spending money, etc.

I'd encourage you to actually *measure* the impact of file-type
identification with one fresh process per file versus one central
component for all files, if that's feasible. If the result is as poor
as we expect, it would be interesting to find ways to improve
performance while keeping separation.

To reflect a question back to the list: what would be a good concept
to support a scenario where a complex component (like Ben's file
identification or a heavyweight JVM) is preinitialized and then used
as a template to quickly spawn independent child processes?

Cheers,
Alex


[1] https://nvd.nist.gov/vuln/detail/CVE-2017-11421

On 20.12.2017 03:16, Nobody III wrote:
> For example, a file manager would benefit from content-based file type
> identification. However, identifying files within the file manager (e.g.
> using libmagic) could pose a security risk. However, using a discrete
> component for this could have a noticeable effect on performance. If we
> use a single component instance for all of the files, the only
> significant added overhead would be from IPC, AFAIK. This seems to be
> acceptable in terms of performance, but a malicious file may be able to
> cause the component to misidentify other files in the directory, which
> could be a security risk. A more secure method would be to run an
> instance of the component for each individual file in the directory.
> However, this may substantially reduce performance for large
> directories, depending on the overhead of component creation. Would
> performance be an issue here? (And am I overestimating the risk of file
> misidentification?)
> 
> Similar cases include icon/thumbnail rendering, general-purpose image
> loading, and text rendering from many different sources. Would there be
> any notable differences for these?

genode-main mailing list
genode-main@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/genode-main