Re: RFC related to reading the script from an arbitrary file descriptor

Ciprian Dorin Craciun Sat, 01 Jun 2013 11:04:32 -0700

On Sat, Jun 1, 2013 at 7:12 PM, Thorsten Glaser <[email protected]> wrote:
>> would behave like `-s`, but which would instead have an argument
>> specifying the file descriptor number to read the script from (instead
>> of stdin).
>
> I wonder at the use cases for this. Sounds pretty marginal to me,
> but maybe you can explain on it a bit.
>
> [...]
>
> I admit I’d like to find a solution that doesn’t involve patching
> mksh first; if you’re okay with that, that is.



    The feature I'm proposing has two very broad use-cases.  (I
haven't fully implemented either of them, but they are on my to-think
and to-implement list.  However, I've toyed with both.)


    (A) Imagine a tool that (for one reason or another) stores
snippets of scripts (and various data files) in a database file (like
SQLite or CDB), and which could be invoked as `do scriptlet
some-arguments`.

    (I've documented this kind of application at this link:
http://wiki.volution.ro/Projects/VolutionDo .)


    (B) An early initrd boot system that doesn't have `/proc` and
`/dev` mounted, or even a writable `/tmp`, and which cleans every
trace of its initial contents from rootfs, replaces it with another
rootfs, and then instead of `init` executes a script that was in the
initial initrd.  (Thus a kind of high level Linux "chainloader",
usable in cloud environments.)


    These use-cases can be solved as follows:

    * (preferred) in case (A) the launcher creates a pipe, forces the
OS to allocate for it a large enough buffer (up to 1MiB is allowed in
Linux), writes the data there, closes the write side, and `execve`'s
mksh; (thus the script doesn't need to exist anywhere on the file
system;)

    * in case (A) we could use `/tmp` (of course based on `tmpfs` to
reduce disk overhead), open the file, write to it, unlink it (thus a
dangling inode until we close it), then give the file descriptor to
`mksh` via `/dev/fd/x`;  however this technique is prone to various
race conditions and security issues, such as leaking the text of the
script;

    * in case (B) the launcher opens the script, unlinks it, and gives
it to `mksh`; (thus just like in the previous;)
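
    A minimal sketch of the open-then-unlink variant from the list
above, in plain shell.  Assumptions: `sh` stands in for `mksh`,
`mktemp` is available, `/dev/fd` is mounted, and the one-line script is
invented for illustration.  It keeps the weakness already noted: the
file is visible by name between its creation and the `rm`.

```shell
# Sketch only: `sh` stands in for `mksh`; the script text is a made-up example.
tmp=$(mktemp)
printf '%s\n' 'echo "arg=$1"' > "$tmp"

exec 5< "$tmp"   # keep a read-only descriptor open on the script
rm -f "$tmp"     # the name is gone; the inode lives until fd 5 is closed

# The child inherits fd 5 and re-opens it through /dev/fd.
unlinked_out=$(sh /dev/fd/5 argument-1)
exec 5<&-        # close our copy of the descriptor

echo "$unlinked_out"   # arg=argument-1
```

    Note that this still needs `/dev/fd`, which is exactly what is
missing in case (B).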


    As you and I have both observed, all of these can be solved
without touching the `mksh` code, however the solutions only seem
"almost" right.  To be fair, I don't know of any other interpreter
that takes an argument denoting a file descriptor (without the
`/dev/fd/x` trick).  Moreover, my use-cases are probably too specific
and obscure to justify this new feature.  However, I think it could be
a nice-to-have feature if the implementation isn't too costly.


    I'll also reply in-line below...


On Sat, Jun 1, 2013 at 7:12 PM, Thorsten Glaser <[email protected]> wrote:
>
>>Moreover, just like in `-c` case, the following extra
>>arguments are treaded like `$0`, and so forth.
>
> JFYI, -s already behaves like that (except the first argument is
> used as $1 not $0).

    The problem is that some scripts use `$0` to change their
behaviour, thus they would break if they were run via `-s`, where
the first extra argument ends up in `$1` instead.
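
    A small sketch of that `-s` behaviour, assuming `sh` as a stand-in
for `mksh` (the argument names `tool-x` and `argument-1` are only
illustrative):

```shell
# Sketch only: `sh` stands in for `mksh`.
# With -s the script comes from stdin and the operands start at $1;
# $0 stays the shell's own name, so the script cannot see "tool-x" as $0.
s_out=$(printf '%s\n' 'echo "name=$0 first=$1 second=$2"' \
        | sh -s tool-x argument-1)

echo "$s_out"   # ends with: first=tool-x second=argument-1
```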


>>    My RFC is the following: how about having a flag, say `-S`, that
>
> Top-level shell options are shared with the set builtin, and I’m
> *very* reluctant to add any more. Let’s discuss this a bit more.

    I know that these options are shared with `set`, however from the
man-page there doesn't seem to exist a `-S` flag even there.

    But fair enough, adding a one-letter flag for such a corner case
could break future compatibility.  However there could be a long-only
option such as `--script-fd`, which is highly unlikely to conflict
with something else.


>>    * using `mksh /dev/fd/5`, but which unfortunately puts `$0` as
>>`5`, and must be overridden in the script;
>
> What do you want in $0 instead? Following -s, I’d put mksh there.

    In case of `mksh /dev/fd/5` indeed it should be `5` (i.e. `/dev`
shouldn't be treated specially).

    However, as I said, some scripts could need their `$0` set to a
proper value, thus I would have wanted something like:

      mksh --script-fd 5 tool-x argument-1 ...


>>    * using `mksh -c '. /dev/fd/5' script-name`, which seems clumsy;
>
> It may seem clumsy but is probably perfectly serviceable…

    The main problem is that it is quite cryptic in what it actually
does.  Thus a person reading it would wonder what happens there, and
it requires the user to understand the meaning of `-c`, `.`, and
`/dev/fd/5` all at once.  Meanwhile the new flag could have explicit
documentation in the man-page.


    The second problem is that these workarounds rely on `/dev`,
`/proc`, or `/tmp` being mounted.
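
    For reference, a sketch of the `-c '. /dev/fd/5'` workaround
discussed here, again assuming `sh` in place of `mksh` and a mounted
`/dev/fd`; the here-document is just a convenient way to put a made-up
script on descriptor 5:

```shell
# Sketch only: `sh` stands in for `mksh`; requires /dev/fd.
# With -c, the first operand after the command string becomes $0,
# and the sourced script sees it together with the remaining arguments.
c_out=$(sh -c '. /dev/fd/5' tool-x argument-1 5<<'EOF'
echo "name=$0 first=$1"
EOF
)

echo "$c_out"   # name=tool-x first=argument-1
```

    It does give the script a proper `$0`, but a reader has to decode
three separate mechanisms to see that.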


>>    * using `mksh -s` and playing with duplicating file descriptors;
>>`mksh` reads the script directly from `0`, which means that if I
>>replace `0` inside the script `mksh` will start reading there instead,
>>thus making impossible to have a proper stdin;
>
> Right. This was my first idea, but it’s not a working solution
> to your problem for that reason, indeed. I occasionally fall
> into that trap too… echo read foo | mksh -s…

    About this, I think it's a potential source of bugs and strange
errors...  I would have personally preferred that the shell `dup`'s
the stdin descriptor and reads from the copy, then opens `/dev/null`
on stdin, so that the script doesn't misbehave and is free to replace
stdin without breaking anything.  (I don't know what POSIX has to say
in this respect, but `bash` behaves just like `mksh`, thus I suspect
it's a conscious design.)


>>    However (most) all of them require the use of `/dev/fd` (or
>>`/proc/self/fd`), which wouldn't work in case these two file systems
>>aren't mounted, such as is the case in early booting where I intend to
>>use this feature.
>
> I see. Maybe we can find another way around this yet…

    Without a dedicated option, I don't see any way around using the
`/dev/fd/x` trick.


> do you
> have any concrete use case for it yet, maybe something already
> written that uses it?

    (See the use-cases at the beginning of this email.)


> I *was* thinking of extending the “exec” builtin to be able to
> set argv[0] to some random string, too… but this would, again,
> require /dev/fd.

    The extension could be useful; once or twice I've run into a case
where I wanted it (mainly it involved chain-loading another script
from within a script).

    Why would this extension require `/dev/fd`?


> If you’ve got tmpfs (from you mentioning /proc
> I assume you’re on GNU/Linux?), we can use that.

    There are three problems with `tmpfs` that I don't like:

    (A) It's a security problem, because it must be handled with care.

    (B) Many distributions still don't mount `/tmp` as `tmpfs` by
default, thus the disk is involved.

    (C) Relying on a writable FS doesn't always work in early boot of
a VM or container.


    Thanks for the reply,
    Ciprian.
