On 14.03.2015 03:40, Laurent Bercot wrote:
Hm, after checking, you're right: the guarantee of atomicity
applies with nonblocking IO too, i.e. there are no short writes.
Which is a good thing, as long as you know that no message will
exceed PIPE_BUF - and that is, for now, the case with uevents, but
I still don't like to rely on it.
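The atomicity guarantee is easy to observe even from a shell. The following is only an illustrative sketch (all paths are temporary; it assumes each printf issues a single write(), which holds for common shells with records this small): two concurrent writers push whole records well under PIPE_BUF into one fifo, and the reader never sees a record torn apart.

```shell
#!/bin/sh
# Minimal sketch of the PIPE_BUF atomicity guarantee: two
# concurrent writers emit whole 201-byte records into the same
# fifo, and the reader never sees a torn record.
dir=$(mktemp -d)
fifo="$dir/demo.fifo"
mkfifo "$fifo"

# Hold the fifo open read-write so writers can open it without
# blocking and the reader never hits EOF in between.
exec 3<>"$fifo"

line_a=$(head -c 200 /dev/zero | tr '\0' 'A')
line_b=$(head -c 200 /dev/zero | tr '\0' 'B')

( i=0; while [ $i -lt 50 ]; do printf '%s\n' "$line_a" >&3; i=$((i+1)); done ) &
( i=0; while [ $i -lt 50 ]; do printf '%s\n' "$line_b" >&3; i=$((i+1)); done ) &
wait

# Drain the 100 records and check each one came through whole.
torn=0
n=0
while [ $n -lt 100 ]; do
    read -r line <&3
    case "$line" in
        "$line_a"|"$line_b") ;;
        *) torn=1 ;;
    esac
    n=$((n+1))
done
echo "torn records: $torn"
```

Note the 100 records (about 20 KB) fit inside the default pipe buffer, so the writers finish before the reader starts draining.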
Named pipes are a proven IPC concept, not only in the Unix world. They
are pipes and behave exactly like them, including nonblocking I/O,
programming the poll loop, and failure handling. There is only one
difference: how you obtain the pipe file descriptors
(by calling pipe() or open()).
I call "pipe" an anonymous pipe. And an anonymous pipe, created by
the netlink listener when it forks the event handler, is clearly the
right solution, because it is private to those two processes. With
a fifo, aka named pipe, any process with the appropriate file system
access may connect to the pipe, and that is a problem:
Right, any process with root access may write to this pipe, but don't
you think such processes have the ability to do other nasty things, like
changing the device node entries in the device file system directly?
Can processes with root access produce confusion on the pipe?
Yes, but aren't such processes able to produce any kind of confusion
they like anyway?
We could have (at some slight extra cost):
- create the fifo with devparser:pipegroup 0620
- run hotplug helper (if used) suid:sgid to hotplug:pipegroup
(or drop privileges to that)
- drop netlink reader after socket creation to same user:group
- run the fifo supervisor as devparser:parsergroup
- but then we need to run the parser as suid root
It needs to access the device file system and do some operations which
require root (as far as I remember). Any suggestion how to avoid that
suid root?
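The fifo setup for the scheme above would boil down to something like the following sketch. The account names devparser/pipegroup are the hypothetical ones from the list; the chown needs root, so it is shown commented out.

```shell
#!/bin/sh
# Sketch of the fifo creation for the permission scheme above.
# Mode 0620: the owning parser reads, group members (netlink
# reader, hotplug helper) write, everyone else is locked out.
# devparser/pipegroup are hypothetical account names.
dir=$(mktemp -d)
fifo="$dir/events"

mkfifo -m 0620 "$fifo"
# chown devparser:pipegroup "$fifo"   # requires root

ls -l "$fifo"
```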
- for writing: why would you need another process to write into the
pipe ? You have *one* authoritative source of information, which is
the netlink listener, any other source is noise, or worse, malevolent.
You are stuck on netlink usage and overlook that you are forcing others
to do it your way. No doubt about the reasons for using netlink, but why
force those who dislike it? Wouldn't this forcing be no different from
forcing others to rely on e.g. systemd? (a provocation, no answer
expected)
Whereas I'm trying to give the user (or say, the system maintainer) the
ability to choose the mechanism he likes, and even the chance to
flip the mechanism by just modifying one or two parameters or commands.
Flipping the mechanism is even possible in a running system, without
disturbance and without changing configuration.
So why is this approach worse than forcing others to do things in a
specific way? Except for those known arguments why netlink is the better
solution, where we absolutely agree.
- for reading: having several readers on the same pipe is the land
of undefined behaviour. You definitely don't want that.
Is anyone here trying to have more than one "reader" on the pipe? The
only reader of the pipe is the parser, and precisely because we are
using fifos, the parser shouldn't bet on the incoming message format and
content. It shall do sanity checks on messages before using them (and
here we hit the point where I expect some overhead, though not much due
to other changes). Isn't it good practice to do this for other pipes too
(even if only out of paranoia)? But all with the benefit of avoiding
re-parsing the conf for every incoming event, and an expected overall
speed improvement. Not to mention the possibility for the user to
choose/flip the mechanism as he likes.
This even opens up extra possibilities, e.g. for debugging and watching
purposes. With a simple redirection of the pipe you may add event
logging functionality and/or a live display of all event messages
(possibly filtered by a formatting script / program). All without extra
cost / impact for normal usage, and without creating special debug
versions of the event handler system.
I'm just trying to make it modular, not monolithic.
- generally speaking, fifos are tricky beasts, with weird and
unclear semantics around what happens when the last writer closes,
different behaviours wrt kernel and even *kernel version*, and more
than their fair share of bugs. Programming a poll() loop around a
fifo is a lot more complicated, counter-intuitive, and brittle, than
it should be (whereas anonymous pipes are easy as pie. Mmm... pie.)
See my statement about fifos above. I don't know what you fear about
fifos, but their usage and functionality are better proven in the Unix
world than you expect. Sure, you need to watch your step, but that also
applies when using pipes (even if only out of paranoia, e.g. checking
incoming data before use instead of blindly relying on it).
And maybe there are internal differences in pipe / fifo handling between
kernels, but they are likely internal and don't change the expected
usage behavior, or else they would risk breaking other pipe
functionality too.
In detail:
Close on last writer: what's unclear about this? What would be
different than with other pipes? The trick is to not let it happen ...
that's the job of the fifo supervisor daemon: hold the named pipe open
and available for usage (buffer space is only assigned to the pipe by
the kernel when there is data), just as tcpsvd holds a network socket
open to accept incoming connections.
Poll loop: fifos are pipes and need exactly the same handling as pipes.
What would be more "brittle" here than with other pipe usage?
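The "hold it open" trick is easy to sketch in shell (illustrative only, not the real supervisor): a process that opens the fifo read-write guarantees there is always at least one writer, so the reading side never sees EOF while individual event producers open, write and close.

```shell
#!/bin/sh
# Sketch of the fifo-supervisor trick: opening the fifo O_RDWR
# keeps a writer around permanently, so the reading side never
# gets EOF when individual event producers come and go.
dir=$(mktemp -d)
fifo="$dir/events"
mkfifo "$fifo"

exec 3<>"$fifo"               # the "supervisor" end: read-write

echo "add sda"    > "$fifo"   # producer 1: opens, writes, closes
echo "remove sda" > "$fifo"   # producer 2: same

# Without fd 3, the opens above would block waiting for a reader
# and the reads below would race against EOF; with it, the two
# records simply queue up in the pipe.
read -r ev1 <&3
read -r ev2 <&3
echo "$ev1 / $ev2"
```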
I'm talking about the netlink because it's a natural source of streamed
uevents, which is exactly what you want to give to a long-lived uevent
handler such as mdev -i.
ACK, but do you like forcing others to do things in a specific way? Do
you like being forced by others?
Why not open up to other possibilities, with a slightly different concept?
... write-only mode ...
Definitely NO. I'm trying to read very carefully and understand the
intention behind what is said, not only the words.
I even understand the fears of e.g. Isaac, but I see little possibility
to stay exactly at the old mdev behavior without blocking any kind of
innovation here (not talking about writing a wrapper around the old,
suffering operation to bolt on netlink usage - shrek).
Please take a look at my s6-uevent-listener/s6-uevent-spawner, or at
Natanael's nldev/nldev-handler. The long-lived uevent handler idea is
a *solved problem*.
I know how that works, and that is the problem. I see limitations of
this approach, which I try to overcome:
1) using the netlink mechanism only -> no problem
2) using the kernel hotplug helper mechanism -> fails to work, or still
suffers from re-parsing the conf for each event.
3) open up my mind and accept that the next one coming around may have a
brand-new plug mechanism in his bag -> may be difficult to do without
changing code.
So why not go one step further and split the known functional blocks
into separate threads, giving a modular system which may be used and
set up according to the maintainer's likes?
So who is in pure write-only mode here?
Your original plan was to:
- write mdev -i: I think it's a good idea.
That is a different part, with no (notable) influence on the rest.
... but as I'm not in write-only mode, I slightly modified my initial
intention:
Let the parser look for specially formatted lines (not much cost), and
on a match, write the line out to stdout (in a shell-friendly format) if
a flag (-i) is set. Otherwise (normal device operation) those lines are
just ignored, like comments. On start of "xdev -i" it is checked whether
stdout is a tty, and stdout is redirected to /dev/null in that case
(don't clobber the console with those lines).
So it's possible to do

  xdev -i /etc/xdev-init.conf | sh xdev-init-script

with xdev-init-script:

  while read cmd line
  do
      set -- $line
      case $cmd in
          TYPE_OF_LINE ) ... react to the line as you like ;;
      esac
  done
No further init operation in xdev -i than this - just the possibility to
re-use the parser to pick out some init-related lines from a file in the
mdev.conf format, without the need for a not-so-trivial sed script, etc.
The point of "xdev -i CONF_FILE >/dev/null" is to check the conf file
for errors, print error messages on stderr, and exit nonzero when errors
are detected - to check the CONF_FILE before the device file system is
started, and allow falling back to sane defaults.
We may even go one step further (just thinking out loud): check the
command line for extended syntax (e.g. "xdev parser SCRIPT_NAME ARGS"),
spawn the given script (once per parser startup), and then forward the
matching lines for incoming events to that script:
  parse the conf file into memory table
  if script file given
      spawn a pipe for the given script
  while read next message with timeout
      search for matching line in memory table
      if we spawned a script
          write out the line information to the pipe
      else
          do the device operation of the matching entry
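The dispatch half of that loop can be sketched in shell (illustrative only - the names and match patterns are made up; the real parser would keep the parsed conf in memory and read with a poll timeout):

```shell
#!/bin/sh
# Toy version of the dispatch loop above. The "memory table" is
# a case statement, events arrive on stdin one per line, and
# handle_event plays the role of "do the device operation".
# All names and patterns here are illustrative.
handle_event() {
    case "$1" in
        sd[a-z])   echo "block device: $1" ;;
        tty[0-9]*) echo "terminal: $1" ;;
        *)         echo "unmatched: $1" ;;
    esac
}

printf 'sda\ntty0\nfoo\n' | while read -r ev; do
    handle_event "$ev"
done
```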
- modify mdev -s to add more functionality than just triggering a
coldplug: I don't think it's good design but I don't care as long as I
can configure it out. Other people have also answered. The answers may
not have been the ones you were looking for, but you wanted feedback,
you got feedback.
So you think it is not worth having some improved symlink handling for
new device nodes?
currently:
=path - allows moving a new device to e.g. a subdirectory
>path - moves the device and adds a symlink pointing to the device
... but what about leaving the device at its original name and creating
a symlink pointing to this device (e.g. /dev/cdrom -> /dev/sr0, not
/dev/sr0 -> /dev/cdrom)?
One intended extension:
<path - shall create a symlink to the device if the symlink does not exist
... and what about combining the move and symlink operations? Move the
new device to a different location *and* add a symlink at yet another
location?
=symlink_path >new_device_path - overwrite an existing symlink
=new_device_path <symlink_path - don't overwrite an existing symlink
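Expressed with plain ln, the intended difference between the two forms would be roughly the following (all paths are illustrative stand-ins for /dev entries):

```shell
#!/bin/sh
# Sketch of the two proposed symlink behaviours using plain ln.
# ">"-style: create the link, overwriting any existing one.
# "<"-style: create the link only if none exists yet.
dir=$(mktemp -d)
: > "$dir/sr0"
: > "$dir/sr1"

# ">"-style: cdrom always follows the newest device
ln -sfn "sr0" "$dir/cdrom"
ln -sfn "sr1" "$dir/cdrom"                      # overwritten

# "<"-style: an existing dvd link is left untouched
ln -sn "sr0" "$dir/dvd" 2>/dev/null || true
ln -sn "sr1" "$dir/dvd" 2>/dev/null || true     # kept at sr0

echo "cdrom -> $(readlink "$dir/cdrom")"
echo "dvd   -> $(readlink "$dir/dvd")"
```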
... and possibly the functionality to include other files in the conf
file, with the possibility to specify a directory, resulting in
including all files from that directory.
What's wrong with my wish for some extended functionality for more
simplicity? Not of public interest?
My major intention with the extensions is improved symlink handling,
mount point creation with permission setup (device file system related),
and mounting of file systems (most likely virtual ones) in the device
file system (e.g. devpts).
Sorry for my hopping on proc and sys; this could be done with the above
extensions without extra cost, but is clearly personal preference.
*No fixed logic* about what gets set up in any system! Just the ability
to focus the device system setup more on "what to set up" (a table with
a list of required symlinks, etc.), instead of describing "how to set it
up" (calling commands in a shell script).
I'm not interested in providing 15 APIs to do the same thing. Users
don't like the netlink ? Tough, if they want serialized uevents. What
would you do if your kid wanted to drive a car but said he didn't like
steering wheels, would you build him a car with a joystick ?
To stay with your example: what if you saw the possibility to build a
base car which has the wheel steering module replaceable by a joystick
module? Now everybody is fine, as he can plug in the steering module he
likes. What is wrong with this? At least with the base idea, not
looking at the cost, which needs to stay within acceptable tolerance.
I provided you with clear designs and working code. So did other
people. (You said code is premature at this point. I'm sorry, but no,
code is not premature when the problem is solved, and if you're not
convinced, please simply study the code, which is extremely short.)
I don't need to study your code further to understand your approach. I
see the caveats of it and try to overcome them.
... but I can't overcome them if you insist on forcing others to do it
your way :(
My clear and *final* statement on this topic is: I want to give every
maintainer the possibility to use the plug mechanism he likes, but still
benefit from increased speed and reduced memory consumption on event
bursts ... *not* replacing one mechanism with another or adding a
wrapper for some new technology, leaving those who dislike it behind,
still suffering from the known problems.
Now it's all up to you. I would like to see a mdev -i, can you work
on it ?
Sure ... as I told you: planning first, then code hacking - but see
above for a possible alternative, before I start any hacking.
If you prefer to keep beating around the bush and smoking crack
about fifo superservers, it's fine too, but I'm just not interested.
... (oops, I said *final statement*) ...
... but extend the above final statement with the following for more
clarity:
I know I dropped my head in the sand in frustration, due to some
comments here (most likely from Denys), but I'm at a point where I have
three possible solutions:
1) get some extended functionality into Busybox
2) fork the project and run a "MyBusybox" project (private or public)
3) drop my development interests (that means all) forever
As I strongly reject #3, and don't like the #2 route (due to all sorts
of known problems - I have already gone that way, privately), I prefer
(and, at least for now, insist) on #1 ... but I have no choice other
than to say "good bye" and hop onto #2 if it's impossible to find a
working solution for #1 :(
It's not me who is forcing others to set up their system in a specific
way, or to choose a specific mechanism, or to be left behind, stuck on
known suffering code.
It's up to you ... you all ...
... but new automobiles are dangerous, don't use them ...
(final: I promise! except -> private)
--
Harald
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox