first steps in the Hurd code

Marcus Brinkmann Thu, 21 Aug 2003 18:06:23 -0700

Hi,

I am moving this thread to [EMAIL PROTECTED]

On Thu, Aug 21, 2003 at 01:13:43AM +0100, Greg Buchholz wrote:
> On Wed, 20 Aug 2003, Patrick Strasser wrote:
> 
> > I'd like to write a document in this style. I think I'm quite qualified
> > for this task, as I do not really know where to start and how things
> > work. Someone else to join the party?
> > Perhaps someone is willing to ask some guiding question?
> 
>     Well, here's one starting point for you.  I've wondered about the
> chain of events that occur when a program makes a call to read().  And
> here I think that hello.c from the hurd source would be instructive
> (so I've included parts of it below with my own questions/comments
> in C++ // style comments).  Here's my best guess as to what happens
> (which is probably --make that surely-- way off the mark).

It's a bit off, but already more than half there.

> A call to
> read() is really a wrapper around another glibc function called
> io_read(). 

Note that for this, you must first map the file descriptor to a Hurd object,
which is currently a mach_port_t.  A Mach port is a sort of capability, or
remote object (really: a send right for an in-kernel message queue here).
I am trying to give bits of the abstract view and bits of the actual
implementation here.

So read() is found in libc/sysdeps/mach/hurd/read.c, and maps to
_hurd_fd_read() which is in libc/hurd/fd-read.c.  How do I know this? 
Experience, but cross-referenced source code (ETAGS, Patrick's web site)
can help you a bit.  Or even gdb and "s" to step through it.  Or looking
into the assembler code of libc ;).

_hurd_fd_read() is, finally, a wrapper around io_read that also has to take
care of controlling terminals, something I would happily ignore on first
reading.  Eventually, it does indeed do the io_read() you were expecting.

However, contacting the filesystem with the filename, as you mentioned,
does of course have to happen at some point.  That point is open() time.
After that, all messages go directly to the object provided by the correct
translator (i.e., you don't do the filename lookup for every RPC).

> This function sends a message to the root translator with
> the name of the target file and a request for a handle to a subroutine
> which will be invoked to get the actual data.

The message is dir_lookup (defined in hurd/hurd/fs.defs), but it does not
request a handle for the interface, but a handle for the object (a
mach_port_t).  The interface you want to invoke for that object is
identified by a msgid (for example, io_read has the msgid 21001, because it
is the second interface in hurd/hurd/io.defs, and hurd/hurd/io.defs has the
"subsystem io 21000" line in it).  So you would do the dir_lookup and then
get a mach_port_t.  With that port, you send a message with the message id
21001 to do the io_read.  Of course we provide a client RPC stub, io_read(),
to do that conveniently.
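
The numbering rule described above can be sketched in a few lines of C.  This
is only an illustration of the arithmetic, not MiG output: the base 21000 and
io_read being the second routine come from the text above, while the enum and
the helper msgid_for() are hypothetical names made up for this sketch.

```c
#include <assert.h>

/* From the text: hurd/hurd/io.defs contains "subsystem io 21000".  */
#define IO_SUBSYSTEM_BASE 21000

/* Position of a routine in the .defs file, counted from 0.  Only the
   position of io_read (second) is taken from the text; the placeholder
   for the first routine is an assumption of this sketch.  */
enum io_routine_index
{
  IO_ROUTINE_FIRST = 0,  /* whatever routine comes first in io.defs */
  IO_ROUTINE_READ = 1    /* io_read, the second routine */
};

/* Routines are numbered consecutively, starting at the subsystem base.  */
int
msgid_for (int subsystem_base, int routine_index)
{
  assert (routine_index >= 0);
  return subsystem_base + routine_index;
}
```

Calling msgid_for (IO_SUBSYSTEM_BASE, IO_ROUTINE_READ) gives 21001, the id
that the io_read() stub puts into the message header for you.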

BTW, dir_lookup also is just an interface implemented by an object, the
directory object you start the lookup from.  This is relevant for relative
paths, and of course for chroot environments.

> The root translator
> checks the name and sees that it belongs to the ext2fs translator.

The root translator is the ext2fs translator.  It is the first translator to
see absolute path names.

> The
> root translator then sends a message to the ext2fs translator to try
> and get to the bottom of the situation.  The ext2fs translator sees
> that the file is really controlled by a translator called hello.

Right.  It would then start up the translator if it doesn't run yet
(assuming you use a passive translator setup).

>  And
> a message is passed back up the chain until io_read gets a pointer to
> the function trivfs_S_io_read.

Not in the way you think.  Well, first we already established that io_read
is not done this way, but even for dir_lookup it is different.  The
ext2fs filesystem cannot do the lookup itself at the hello translator,
because its permissions might be different from the user's permissions.  So
the user has to do the lookup itself.  So ext2fs returns dir_lookup with
"do_retry" being FS_RETRY_NORMAL, "retry_name" the name of the file to open
_relative to the hello translator_ and "result" being the root directory
port of the hello translator :)  This combination tells the glibc in the
client that it now has to send dir_lookup to the hello translator.

This is a bit like table tennis, with the client on the one side and all
servers on the other side.  For more details, see my talk available online
at the Hurd web site:

http://www.gnu.org/software/hurd/hurd-talk.html#how
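
The table-tennis exchange can be modelled as a small loop on the client side.
The following is a toy model only, assuming two fake servers in one process:
the port numbers, the struct lookup_reply and all function names are
hypothetical, and the real client code in glibc handles more retry types and
real Mach ports.  It just shows that the client, not ext2fs, sends the second
dir_lookup.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins: a "port" is just a small integer here.  */
enum { PORT_NONE = -1, PORT_EXT2FS = 0, PORT_HELLO = 1, PORT_HELLO_NODE = 2 };

enum retry_type { RETRY_NONE, RETRY_NORMAL };  /* models FS_RETRY_NORMAL */

struct lookup_reply
{
  enum retry_type do_retry;
  const char *retry_name;  /* rest of the name, relative to `result' */
  int result;              /* next port to ask, or the final object */
};

/* Toy root filesystem: the node "hello" has a translator sitting on it.  */
struct lookup_reply
ext2fs_dir_lookup (const char *name)
{
  struct lookup_reply r = { RETRY_NONE, "", PORT_NONE };
  if (strncmp (name, "hello", 5) == 0 && (name[5] == '\0' || name[5] == '/'))
    {
      r.do_retry = RETRY_NORMAL;
      r.retry_name = name[5] == '/' ? name + 6 : "";
      r.result = PORT_HELLO;  /* root port of the hello translator */
    }
  return r;
}

/* Toy hello translator: it only knows its own root node.  */
struct lookup_reply
hello_dir_lookup (const char *name)
{
  struct lookup_reply r = { RETRY_NONE, "", PORT_NONE };
  if (name[0] == '\0')
    r.result = PORT_HELLO_NODE;
  return r;
}

/* The client keeps sending dir_lookup until no retry is requested.  */
int
client_lookup (int start, const char *name, int *round_trips)
{
  int port = start;
  *round_trips = 0;
  for (;;)
    {
      struct lookup_reply r;
      assert (port == PORT_EXT2FS || port == PORT_HELLO);
      r = port == PORT_EXT2FS ? ext2fs_dir_lookup (name)
                              : hello_dir_lookup (name);
      ++*round_trips;
      if (r.do_retry == RETRY_NONE)
        return r.result;
      port = r.result;      /* ask the server we were pointed to ... */
      name = r.retry_name;  /* ... for the remaining part of the name */
    }
}
```

Looking up "hello" starting at PORT_EXT2FS takes two round trips in this
model: one to ext2fs, which answers with a retry, and one to the hello
translator, which returns the object.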

> Now we're starting to get somewhere.
> io_read() then grabs some memory (from which process space I'm not
> sure) and tells Mach, "make this call to trivfs_S_io_read for me and
> if you get an answer back, shove the data in this chunk of memory".

In fact, I think that io_read doesn't do any preallocation of the memory,
but leaves it up to the server.  It could, but that would in fact just be
slower (this will change somewhat in L4).  The server allocates pages from
the kernel (it can't use malloc because the memory must be page aligned), and
then writes the data into them.  The pages are then transferred in the reply
message.

Passing memory out of line is part of Mach's IPC system.  Mach actually
maps the pages out of the server process and into the client process, so no
additional work on that is necessary.
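
To make the "whole, page-aligned pages" requirement concrete, here is a
minimal sketch using plain POSIX mmap().  It is an assumption-laden stand-in:
a real Hurd server would hand such pages to Mach in the reply message, which
this sketch does not attempt, and the names round_to_pages() and
alloc_reply_pages() are invented for the example.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS with strict -std= settings */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round LEN up to a whole number of pages.  */
size_t
round_to_pages (size_t len, size_t page_size)
{
  assert (page_size > 0);
  return (len + page_size - 1) / page_size * page_size;
}

/* Allocate at least LEN bytes as whole, page-aligned, zero-filled pages.
   Stores the mapped size in *MAPPED_LEN and returns NULL on failure.
   Unlike malloc, the result always starts on a page boundary.  */
void *
alloc_reply_pages (size_t len, size_t *mapped_len)
{
  size_t page_size = (size_t) sysconf (_SC_PAGESIZE);
  void *p;

  *mapped_len = round_to_pages (len, page_size);
  p = mmap (NULL, *mapped_len, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? NULL : p;
}
```

A caller would fill the returned pages with the reply data and release them
with munmap (p, mapped_len) when they are not passed on.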

_hurd_fd_read must then copy the data to the user buffer.  It first checks
whether the reply already landed in the user's buffer, because under some
circumstances that buffer can be used directly and no copy is needed.  But
that is only a detail of Mach's IPC system and how memory is exchanged in
messages using it.  You can read all the details about this in the Mach docs.
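
The "copy only if the reply came back in a different buffer" idea can be
sketched like this.  It is a simulation under stated assumptions, not the
glibc code: fake_io_read() stands in for the RPC, malloc/free stand in for
pages that the real code would return to the kernel, and the out_of_line
flag is a made-up knob to force the server-allocated case.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the RPC.  It either fills the caller's
   buffer or, if asked to or if the data does not fit, replaces *DATA
   with a freshly allocated buffer.  */
int
fake_io_read (const char *src, size_t src_len, int out_of_line,
              char **data, size_t *data_len, size_t amount)
{
  size_t n = src_len < amount ? src_len : amount;

  if (out_of_line || n > *data_len)
    {
      char *fresh = malloc (n ? n : 1);
      if (fresh == NULL)
        return -1;
      *data = fresh;
    }
  memcpy (*data, src, n);
  *data_len = n;
  return 0;
}

/* Sketch of the client side: offer the user's buffer, then copy back
   only when the reply arrived somewhere else.  BUF must be non-null.  */
long
sketch_fd_read (const char *src, size_t src_len, int out_of_line,
                char *buf, size_t nbytes)
{
  char *data = buf;
  size_t nread = nbytes;

  assert (buf != NULL);
  if (fake_io_read (src, src_len, out_of_line, &data, &nread, nbytes) != 0)
    return -1;

  if (data != buf)
    {
      if (nread > nbytes)  /* never trust the server with our bounds */
        nread = nbytes;
      memcpy (buf, data, nread);
      free (data);  /* the real code gives the pages back to the kernel */
    }
  return (long) nread;
}
```

With out_of_line set to 0 and a large enough buffer, no second buffer is
involved at all; with out_of_line set to 1 the data takes the detour through
the server-provided buffer and is copied once.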

> Mach then dutifully calls the appropriate function which fills our
> little buffer of memory with the correct bytes.

Or a new buffer :)

> After the rpc
> returns, io_read() examines the contents of the message from Mach
> and if everything went swell, it copies the data into the original
> buffer which was initially passed to it from read().

Right.  _hurd_fd_read does that.

>  I think most of my confusion comes from how Mach passes messages
> back and forth and who controls what memory.  (I probably need a
> *beginner's* guide to Mach IPC, if someone could point me in the right
> direction).

Maybe check out
http://www.gnu.org/software/hurd/hacking-guide/hhg.html

It's not about the Mach details, but covers the main idioms that you need to
know when writing Hurd code.  You don't really want to know the details on
this if you can avoid it :)

> I've also wondered how much of the Machisms will remain
> after the transition to L4.

Memory exchange will be vastly different.  The object/interfaces with msgid
system will remain, although the IPC system in general is very different.

We will not use mach_port_t, but a new type hurd_cap_t.  Such things.  But
the abstract concepts will remain, and the protocols like dir_lookup will
mostly stay as they are (with a few exceptions).

> Of course it's also obvious that I don't
> understand what translators get involved when a file name is trying
> to be resolved, so I'd be happy for any info about that area.

This is in my talk.

> And here are my related questions about hello.c and trivfs_S_io_read...
> 
> error_t
> trivfs_S_io_read (struct trivfs_protid *cred,
>           mach_port_t reply, mach_msg_type_name_t reply_type,
>           char **data, mach_msg_type_number_t *data_len,
>           loff_t offs, mach_msg_type_number_t amount)
> {
>   struct open *op;
> 
>   /* Deny access if they have bad credentials. */
>   if (! cred)
>     return EOPNOTSUPP;
>   else if (! (cred->po->openmodes & O_READ))
>     return EBADF;
> //What exactly are credentials?  Are they Mach port permissions?

In any server side of the RPC, we map the port to the object the port
represents, because this is what you usually want (the port number itself is
useless).  This is done in "mutations" as a MiG feature.  grep the Hurd
source for mutations and after a while you will see what I mean.

The credential is the libtrivfs object that is behind every filesystem port,
for example behind the fd that you get when you open the hello world node.
The credential contains information about the caller of the function, ie,
which user ids he has, which open mode flags were used etc.

Now if CRED is NULL, then that means that the mutation didn't work.  For
this you must know that for any message with this msg id that comes in, the
above function will be called.  The port need not be a filesystem
node; it could also be a port for another service trivfs provides.  If the
port is not a valid filesystem port, the mutation will return NULL, and so
the first argument is NULL.  In this case, we reject the message.
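
As a rough model of what the mutation does before your function runs, the
sketch below maps a port name to an object and yields NULL for ports of the
wrong kind.  Everything here is hypothetical scaffolding (the table, the
struct protid_sketch, the port numbers); only the two checks and their error
codes mirror the quoted hello.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum port_kind { PORT_FS_NODE, PORT_CONTROL };  /* hypothetical kinds */

#define SKETCH_O_READ 1

/* Stands in for struct trivfs_protid; only the open mode is modelled.  */
struct protid_sketch { int openmodes; };

struct port_entry
{
  int name;
  enum port_kind kind;
  struct protid_sketch *protid;
};

static struct protid_sketch reader = { SKETCH_O_READ };
static struct protid_sketch no_read = { 0 };

static struct port_entry port_table[] = {
  { 10, PORT_FS_NODE, &reader },
  { 11, PORT_FS_NODE, &no_read },
  { 12, PORT_CONTROL, NULL },   /* some other service's port */
};

/* The "mutation": port name in, object out, NULL if the port is not a
   filesystem port.  */
struct protid_sketch *
lookup_protid (int port_name)
{
  size_t i;
  assert (port_name >= 0);  /* port names are non-negative in this sketch */
  for (i = 0; i < sizeof port_table / sizeof port_table[0]; i++)
    if (port_table[i].name == port_name
        && port_table[i].kind == PORT_FS_NODE)
      return port_table[i].protid;
  return NULL;
}

/* Server-side handler with the same two checks as the quoted code.  */
int
S_io_read_sketch (struct protid_sketch *cred)
{
  if (cred == NULL)
    return EOPNOTSUPP;  /* not one of our filesystem ports */
  if (!(cred->openmodes & SKETCH_O_READ))
    return EBADF;       /* valid port, but not opened for reading */
  return 0;
}

/* Conceptually what happens for every incoming message with this msgid.  */
int
dispatch_io_read (int port_name)
{
  return S_io_read_sketch (lookup_protid (port_name));
}
```

In this model, dispatch_io_read (12) and dispatch_io_read (99) both end in
EOPNOTSUPP: the handler is still called, but with a NULL first argument.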

> //Are they related to not-logged in users?  Surely it has
> //nothing to do with file permissions.  Right?  What happens
> //if I don't deny a caller with bad credentials.

Then you are implementing io_read for ports that are not the normal
filesystem ports, but ports of some different type.  In this case, it would
be a good idea to extend the mutation so that you get back the object behind
that port, otherwise you have no information.

For some interfaces, it might not matter at all (although I disagree with
that), and you can always just do the operation requested.  But this is the
exception.

>  Is this a
> //security risk? 

Likely, at least a potential DoS attack.

> If I'm always supposed to deny access to callers
> //with bad credentials, why doesn't libtrivfs do this for me?
> //I thought it was a convenience wrapper.

If it were to do that, you'd have another function call in the RPC path just
for that.

>   /* Get the offset. */
>   op = cred->po->hook;
>   if (offs == -1)
>     offs = op->offs;
> 
>   /* Prune the amount they want to read. */
>   if (offs > contents_len)
>     offs = contents_len;
>   if (offs + amount > contents_len)
>     amount = contents_len - offs;
> 
>   if (amount > 0)
>     {
>       /* Possibly allocate a new buffer. */
>       if (*data_len < amount)
>         {
>           *data = mmap (0, amount, PROT_READ|PROT_WRITE, MAP_ANON, 0, 0);
>           if (*data == MAP_FAILED)
>             return ENOMEM;
>         }
> //I'm assuming that we might have to allocate more memory because
> //Mach messages have a default size which is not big enough and we
> //don't want to overflow a buffer.

Actually, it's MiG, I think.

> But what happens if I ignore
> //this and write beyond my boundaries.  Who suffers? 

The server will segfault, or other bad things happen.

> Does just this
> //translator die?  Or do I take others down with me?

What could happen is that the data you were supposed to write (possibly
secret) ends up in a neighbouring buffer that is then returned to some other
user.  So you might die (and then all your users will be notified of your
death at some point), or you will not die and just misbehave.  But it is
always your fault then :)
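
The pruning step in the quoted code is what keeps the server inside its own
buffer, so it is worth looking at in isolation.  This is a small sketch with
invented names (struct read_window, prune_read); it uses unsigned sizes and
compares against contents_len - offs rather than offs + amount, which is a
deliberate change from the quoted code to avoid overflow in the addition.

```c
#include <assert.h>
#include <stddef.h>

struct read_window
{
  size_t offs;
  size_t amount;
};

/* Clamp a read request so that offs + amount never exceeds
   CONTENTS_LEN.  */
struct read_window
prune_read (size_t contents_len, size_t offs, size_t amount)
{
  struct read_window w;

  if (offs > contents_len)
    offs = contents_len;
  if (amount > contents_len - offs)
    amount = contents_len - offs;

  w.offs = offs;
  w.amount = amount;
  assert (w.offs + w.amount <= contents_len);
  return w;
}
```

For 13 bytes of contents, a request for 100 bytes at offset 0 is cut down to
13 bytes, and a request at offset 20 is cut down to 0 bytes at offset 13.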

>  I know that
> //if the mmap is replaced with a malloc or an automatic array, that I
> //get a 16MB core dump when the segv occurs.  And I don't think the
> //hello translator is 16MB in size.
> //And I'm a bit confused as to who eventually deallocates this memory.
> //Is it Mach?  glibc?

It's Mach.  That's why it has to be full pages.  The reason Mach does this
is because a special flag is set in the reply message.  You can see this in
the definition from hurd/hurd/io.defs:

routine io_read (
        io_object: io_t;
        RPT
        out data: data_t, dealloc;
                         ^^^^^^^^
        offset: loff_t;
        amount: vm_size_t);

The "dealloc" is responsible for this.

>   And I can pretty much generate an almost endless supply of
> newbie questions if you need more :)

My question is, why didn't you ask before?  Surely you have not thought of
them today for the first time.  I remember I struggled with exactly the same
questions for a long time, and didn't dare to ask what a credential is :)
But that was when only Roland and Thomas knew the answer, and I was already
annoying them very much with other things.  Today we have several people who know
all this, and maybe at some point someone will even write it down.

Thanks,
Marcus

-- 
`Rhubarb is no Egyptian god.' GNU      http://www.gnu.org    [EMAIL PROTECTED]
Marcus Brinkmann              The Hurd http://www.gnu.org/software/hurd/
[EMAIL PROTECTED]
http://www.marcus-brinkmann.de/

_______________________________________________
Help-hurd mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/help-hurd
