Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Wed, Dec 17, 2014 at 08:04:38PM +0100, Lennart Poettering wrote: This is important details, because if you use epoll file descriptor in another epoll then you're correctly notified on the top-level epoll, but you lost information about which underneath file descriptor is active -- then it was impossible to verify the inotify IN_MOVED_TO utab event. Hmm? Not following. The top-level epoll will get an event telling you that which low-level epoll is triggered. Then, you read an event from that which tells you precisely which actual file has been triggered... Yes, was my original idea, then followed by frustration ;-) It seems it the hierarchy of epolls works only if all the file descriptors are with EPOLLIN. I had /proc/self/mountinfo with EPOLLPRI only (because with EPOLLIN it generates events all time as I don't read the file content). Today I played with it a little bit more and I found that possible solution is to use EPOLLPRI | EPOLLIN | EPOLLET for the /proc/self/mountinfo. Ideas? Karel PS. if anyone wants to play with it then below is test program, just copy and past to test.c $ make test $ touch AAA BBB $ ./test AAA BBB and cat AAA or mount something on another terminal #include stdio.h #include stdlib.h #include err.h #include sys/epoll.h #include sys/inotify.h #include fcntl.h #include sys/stat.h #include fcntl.h static int get_inotify(const char *filename) { int fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC); if (inotify_add_watch(fd, filename, IN_CLOSE_NOWRITE) 0) err(EXIT_FAILURE, %s: add watch failed, filename); return fd; } static void clean_inotify_buf(int fd) { char buf[BUFSIZ]; while (read(fd, buf, sizeof(buf)) 0); /* non-blocking */ } int main(int argc, char **argv) { int a, b, c; int low_efd, high_efd; struct epoll_event ev; /* low epoll */ low_efd = epoll_create1(EPOLL_CLOEXEC); if (low_efd 0) err(EXIT_FAILURE, failed to create epoll); ev.events = EPOLLPRI | EPOLLIN; a = ev.data.fd = get_inotify(argv[1]); if (epoll_ctl(low_efd, EPOLL_CTL_ADD, a, ev) 0) err(EXIT_FAILURE, failed to add %s to low-epoll, argv[1]); ev.events = EPOLLPRI | EPOLLIN; b = ev.data.fd = get_inotify(argv[2]); if (epoll_ctl(low_efd, EPOLL_CTL_ADD, b, ev) 0) err(EXIT_FAILURE, failed to add %s to low-epoll, argv[2]); ev.events = EPOLLPRI | EPOLLIN | EPOLLET; c = ev.data.fd = open(/proc/self/mountinfo, O_RDONLY | O_CLOEXEC); if (epoll_ctl(low_efd, EPOLL_CTL_ADD, c, ev) 0) err(EXIT_FAILURE, failed to add mountinfo to low-epoll); /* high epoll */ high_efd = epoll_create1(EPOLL_CLOEXEC); if (high_efd 0) err(EXIT_FAILURE, failed to create high-epoll); ev.events = EPOLLPRI | EPOLLIN; ev.data.fd = low_efd; if (epoll_ctl(high_efd, EPOLL_CTL_ADD, low_efd, ev) 0) err(EXIT_FAILURE, failed to add to high-epoll); fprintf(stderr, top=%d\n, high_efd); fprintf(stderr, |\n); fprintf(stderr, low=%d\n, low_efd); fprintf(stderr, / | \\\n); fprintf(stderr, A=%d B=%d C=%d\n\n, a, b, c); do { struct epoll_event events[1]; int n; fprintf(stderr, Wainting for event...\n); n = epoll_wait(high_efd, events, 1, -1); if (n 0) err(EXIT_FAILURE, high-epoll wait failed); if (!n) continue; fprintf(stderr, *** has high event (fd=%d)\n, events[0].data.fd); do { n = epoll_wait(low_efd, events, 1, 0); if (n 0) err(EXIT_FAILURE, low-epoll wait failed); else if (n) { int fd = events[0].data.fd; fprintf(stderr,*** has low event (fd=%d)\n, fd); if (fd == a || fd == b) clean_inotify_buf(fd); } else break; /* no event */ } while (1); clean_inotify_buf(a); clean_inotify_buf(b); } while (1); } -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, Dec 12, 2014 at 08:11:54PM +0100, Lennart Poettering wrote: I guess it's enough to add the 'fd' to systmed sd_event_add_io() and call mnt_table_parse_mtab() when a change is detected. (As already implemeted in the original Chris' patch.) Karel, if I got this right, then the new monitor stuff will only wrap inotify on utab, right? I think it would be useful if it would also abstract notifications via /proc/self/mountinfo in it. To make the interface easy for this and to be able to just hand out a single fd, this would mean creating an epoll fd inside the lib, then adding the inotify fd for utab to it, and then on top the EPOLLPRI watch on /proc/self/mountinfo. This way apps would get the full set of notifications via your library, without knowing what's going on underneath, and userspace notifications and kernel notifications would come the same way. OK, implemented (util-linux monitor branch on github). But I have changed the userspace monitor to care about /run/mount/utab.lock file. It's better than the original Chris idea (monitor all /run/mount directory for utab rename changes). This new solution is without possible false positives. This is important details, because if you use epoll file descriptor in another epoll then you're correctly notified on the top-level epoll, but you lost information about which underneath file descriptor is active -- then it was impossible to verify the inotify IN_MOVED_TO utab event. The another advantage is that the event is triggered really after utab update (the update is covered by flock, and close() means unlock). Now: your-epoll | library-epoll-FD /\ mountinfo-FD utab.lock-inotify-FD Note that after notification it's necessary to cleanup (drain) inotify bufferes, so I have added mnt_monitor_event_cleanup(). Example: mn = mnt_new_monitor(); mnt_monitor_enable_userspace(mn, TRUE); mnt_monitor_enable_kernel(mn, TRUE); fd = mnt_monitor_get_fd(mn); ...add 'fd' to your epoll... n = epoll_wait(your_efd, events, 1, -1); ... if (events[0].data.fd == fd) { mnt_monitor_event_cleanup(mn); printf(libmount notification\n); ...parse mount tables... } ... mnt_unref_monitor(mn); The API also provides mnt_monitor_wait() for users who don't want to own epoll, but this is irrelevant for systemd. Karel -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Wed, 17.12.14 13:13, Karel Zak (k...@redhat.com) wrote: On Fri, Dec 12, 2014 at 08:11:54PM +0100, Lennart Poettering wrote: I guess it's enough to add the 'fd' to systmed sd_event_add_io() and call mnt_table_parse_mtab() when a change is detected. (As already implemeted in the original Chris' patch.) Karel, if I got this right, then the new monitor stuff will only wrap inotify on utab, right? I think it would be useful if it would also abstract notifications via /proc/self/mountinfo in it. To make the interface easy for this and to be able to just hand out a single fd, this would mean creating an epoll fd inside the lib, then adding the inotify fd for utab to it, and then on top the EPOLLPRI watch on /proc/self/mountinfo. This way apps would get the full set of notifications via your library, without knowing what's going on underneath, and userspace notifications and kernel notifications would come the same way. OK, implemented (util-linux monitor branch on github). But I have changed the userspace monitor to care about /run/mount/utab.lock file. It's better than the original Chris idea (monitor all /run/mount directory for utab rename changes). This new solution is without possible false positives. Not following here. As long as everybody writing utab moves the file atomically into place I see no reason to watch the lock file ever. This is important details, because if you use epoll file descriptor in another epoll then you're correctly notified on the top-level epoll, but you lost information about which underneath file descriptor is active -- then it was impossible to verify the inotify IN_MOVED_TO utab event. Hmm? Not following. The top-level epoll will get an event telling you that which low-level epoll is triggered. Then, you read an event from that which tells you precisely which actual file has been triggered... The another advantage is that the event is triggered really after utab update (the update is covered by flock, and close() means unlock). Now: your-epoll | library-epoll-FD /\ mountinfo-FD utab.lock-inotify-FD Note that after notification it's necessary to cleanup (drain) inotify bufferes, so I have added mnt_monitor_event_cleanup(). Yupp, makes sense. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, 05.12.14 15:54, Karel Zak (k...@redhat.com) wrote: I have added struct libmnt_monitor to make this new interface easy to extend and usable for more resources (I'll probably also add mountinfo fd for findmnt(8), but this is irrelevant for systemd;-) All you need is: mn = mnt_new_monitor(); fd = mnt_monitor_userspace_get_fd(mn, NULL);/* utab monitor fd */ mnt_monitor_get_events(mn, fd, ev.events); /* EPOLLIN ... */ efd = epoll_create1(EPOLL_CLOEXEC); epoll_ctl(efd, EPOLL_CTL_ADD, fd, ev); n = epoll_wait(efd, events, 1, -1); id (n == 1 mnt_monitor_is_changed(mn, fd) == 1) printf(%s: change detected\n, mnt_monitor_get_filename(mn, fd)); mnt_unref_monitor(mn); close(efd); I guess it's enough to add the 'fd' to systmed sd_event_add_io() and call mnt_table_parse_mtab() when a change is detected. (As already implemeted in the original Chris' patch.) Karel, if I got this right, then the new monitor stuff will only wrap inotify on utab, right? I think it would be useful if it would also abstract notifications via /proc/self/mountinfo in it. To make the interface easy for this and to be able to just hand out a single fd, this would mean creating an epoll fd inside the lib, then adding the inotify fd for utab to it, and then on top the EPOLLPRI watch on /proc/self/mountinfo. This way apps would get the full set of notifications via your library, without knowing what's going on underneath, and userspace notifications and kernel notifications would come the same way. Does this make sense? Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, Dec 12, 2014 at 08:11:54PM +0100, Lennart Poettering wrote: On Fri, 05.12.14 15:54, Karel Zak (k...@redhat.com) wrote: I guess it's enough to add the 'fd' to systmed sd_event_add_io() and call mnt_table_parse_mtab() when a change is detected. (As already implemeted in the original Chris' patch.) Karel, if I got this right, then the new monitor stuff will only wrap inotify on utab, right? Right. I think it would be useful if it would also abstract notifications via /proc/self/mountinfo in it. To make the I have had plan to add mnt_monitor_kernel_get_fd(). interface easy for this and to be able to just hand out a single fd, this would mean creating an epoll fd inside the lib, then adding the inotify fd for utab to it, and then on top the EPOLLPRI watch on /proc/self/mountinfo. This way apps would get the full set of notifications via your library, without knowing what's going on underneath, and userspace notifications and kernel notifications would come the same way. I don't want provide only high-level abstraction, sometimes it's useful for developers to have access to low-level things (for example sometimes utab monitoring is unnecessary overkill). It's also possible that in future there will be more things to monitor (mountinfo in another namespaces, FS specific things, ...etc). Maybe the API should be extended to something like: mnt_monitor_enable_userspace(mn, TRUE); mnt_monitor_enable_kernel(mn, TRUE); fd = mnt_monitor_get_fd(mn); where 'fd' is the top level file descriptor to monitor all enabled things. Hmm... OK, next week ;-) Karel -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Wed, Dec 10, 2014 at 01:53:09AM +0100, Lennart Poettering wrote: On Tue, 09.12.14 12:30, Karel Zak (k...@redhat.com) wrote: On Mon, Dec 08, 2014 at 08:10:08PM +0100, Lennart Poettering wrote: Any idea when you intend to realease this new API in a release or even in a stable one? I'd like to have v2.26-rc1 this month. Hmm, OK, then I'll release 218 with the current direct inotify or you can release it without this feature :-) It's all about mount -o _netdev executed from command line and it's unsupported by systemd for years -- I think we can wait one release. Karel -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Mon, Dec 08, 2014 at 08:10:08PM +0100, Lennart Poettering wrote: Any idea when you intend to realease this new API in a release or even in a stable one? I'd like to have v2.26-rc1 this month. Karel -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Tue, 09.12.14 12:30, Karel Zak (k...@redhat.com) wrote: On Mon, Dec 08, 2014 at 08:10:08PM +0100, Lennart Poettering wrote: Any idea when you intend to realease this new API in a release or even in a stable one? I'd like to have v2.26-rc1 this month. Hmm, OK, then I'll release 218 with the current direct inotify hookup. We'll update this in systemd to use lbmount's native APIs as soon as that's stable (and hit Fedora...). I added TODO list items for this for systemd, so that we don't forget to do that. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, 05.12.14 15:54, Karel Zak (k...@redhat.com) wrote: On Fri, Dec 05, 2014 at 01:46:06AM +0100, Lennart Poettering wrote: With such an API you have the liberty to change later on what precisely you expose there. The fact that you watch a file would be entirely opaque, it could one day be a pipe or socket, or even an fd on some kernel fd, where you just tell us the kind of events you want the user code to listen on. This is how we wrap a lot of our own APIs, to allow integration into arbitrary main loops, without restricting us to decide how precisely the event is generated. Does this make sense? Yep. Would love to get an API in place for this in libmount, because I don't really want to release systemd with the current code. Implemented. I have added struct libmnt_monitor to make this new interface easy to extend and usable for more resources (I'll probably also add mountinfo fd for findmnt(8), but this is irrelevant for systemd;-) All you need is: mn = mnt_new_monitor(); fd = mnt_monitor_userspace_get_fd(mn, NULL);/* utab monitor fd */ mnt_monitor_get_events(mn, fd, ev.events); /* EPOLLIN ... */ efd = epoll_create1(EPOLL_CLOEXEC); epoll_ctl(efd, EPOLL_CTL_ADD, fd, ev); n = epoll_wait(efd, events, 1, -1); id (n == 1 mnt_monitor_is_changed(mn, fd) == 1) printf(%s: change detected\n, mnt_monitor_get_filename(mn, fd)); mnt_unref_monitor(mn); close(efd); I guess it's enough to add the 'fd' to systmed sd_event_add_io() and call mnt_table_parse_mtab() when a change is detected. (As already implemeted in the original Chris' patch.) usable example: https://github.com/karelzak/util-linux/blob/master/libmount/src/monitor.c#L345 Any idea when you intend to realease this new API in a release or even in a stable one? Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, Dec 05, 2014 at 01:46:06AM +0100, Lennart Poettering wrote: With such an API you have the liberty to change later on what precisely you expose there. The fact that you watch a file would be entirely opaque, it could one day be a pipe or socket, or even an fd on some kernel fd, where you just tell us the kind of events you want the user code to listen on. This is how we wrap a lot of our own APIs, to allow integration into arbitrary main loops, without restricting us to decide how precisely the event is generated. Does this make sense? Yep. Would love to get an API in place for this in libmount, because I don't really want to release systemd with the current code. Implemented. I have added struct libmnt_monitor to make this new interface easy to extend and usable for more resources (I'll probably also add mountinfo fd for findmnt(8), but this is irrelevant for systemd;-) All you need is: mn = mnt_new_monitor(); fd = mnt_monitor_userspace_get_fd(mn, NULL);/* utab monitor fd */ mnt_monitor_get_events(mn, fd, ev.events); /* EPOLLIN ... */ efd = epoll_create1(EPOLL_CLOEXEC); epoll_ctl(efd, EPOLL_CTL_ADD, fd, ev); n = epoll_wait(efd, events, 1, -1); id (n == 1 mnt_monitor_is_changed(mn, fd) == 1) printf(%s: change detected\n, mnt_monitor_get_filename(mn, fd)); mnt_unref_monitor(mn); close(efd); I guess it's enough to add the 'fd' to systmed sd_event_add_io() and call mnt_table_parse_mtab() when a change is detected. (As already implemeted in the original Chris' patch.) usable example: https://github.com/karelzak/util-linux/blob/master/libmount/src/monitor.c#L345 Karel -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Mon, 01.12.14 16:47, Karel Zak (k...@redhat.com) wrote: On Mon, Dec 01, 2014 at 02:28:33PM +0100, Zbigniew Jędrzejewski-Szmek wrote: Wouldn't be enough to use Chris' iSCSI and FCoE auto detection? Please see previous discussion... Detecting network might not be trivial if the devices are layered and there's a network-requiring device somewhere lower in the stack. Good point. (It would be possible to analyze whole stack by slave/holders relations, but I agree it's too complex and probably too fragile.) Anyway, if you still want to read userspace mount options than I can add something like: mnt_inotify_mtab_add_watch() mnt_inotify_mtab_rm_watch() mnt_inotify_mtab_changed() to hide all the paths and private mtab/utab stuff. Maybe something more like what journal client code does here: mnt_mtab_get_fd() - epoll fd mnt_mtab_get_events() - EPOLLIN|EPOLLPRI mnt_mtab_process() - information what changed Hmm.. utab is optional and very often does not exist, and library uses rename(2) to do atomic update of the file. This is reason why Chris have used inotify for /run/mount directory (and Lennart asked for inotify API). I have doubts you can use epoll to monitor directories, epoll is about I/O monitoring. Well, what Zbigniew was suggesting is to hide the fact that inotify is used away, and instead do that internally. I.e. provide three calls: one that tells the user of this the fd to listen on, the second that tells the user of this the poll events to listen on, and finally one function to call when actually one of those poll events is triggred. With such an API you have the liberty to change later on what precisely you expose there. The fact that you watch a file would be entirely opaque, it could one day be a pipe or socket, or even an fd on some kernel fd, where you just tell us the kind of events you want the user code to listen on. This is how we wrap a lot of our own APIs, to allow integration into arbitrary main loops, without restricting us to decide how precisely the event is generated. Does this make sense? Would love to get an API in place for this in libmount, because I don't really want to release systemd with the current code. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Sun, 23.11.14 20:33, Chris Leech (cle...@redhat.com) wrote: This adds auto detection for iSCSI and some FCoE drivers and treats mounts to file-systems on those devices as remote-fs. Hmm, I am not too fond of having such subsystem-specific complexity in systemd. Hmm, isn't this something that could be fixed in the kernel in a generic way? Introduce a new generic sysfs attr in a block device that indicates that some block device involves networking? Similar to the rotational flag? I mean, I am not convinced that systemd should be the place where this is abstracted, I'd much rather see the kernel do this for us, and us simply making use of the information. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, Nov 28, 2014 at 09:27:43PM +0100, Lennart Poettering wrote: On Fri, 28.11.14 20:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote: On Sun, Nov 23, 2014 at 08:33:41PM -0800, Chris Leech wrote: This adds auto detection for iSCSI and some FCoE drivers and treats mounts to file-systems on those devices as remote-fs. Signed-off-by: Chris Leech cle...@redhat.com No need for this. I now pushed patches 1-4 with some small changes here and there. Since libmount is not optional, I removed if from the version string. This patch I didn't push: this seems like something that would be better done through udev rules, by setting some tags. Then systemd could simply check for the presence of a tag on the device without having any special knowledge about iscsi and firends. Honestly, I am really not sure I like this patch. One one hand we now have a hard dep on libmount. Which I figure is mostly OK. However, what I find really weird about this is that even though libmount is supposed to abstract access to the utab away, it doesn't sufficiently: we still hardcode the utab path now so that we can watch it. I mean, either we use an abstracted interface or we don't. THis really smells to me as either libmount should provide some form of inotify iface to utab, or we should parse utab directly. If libmount is the only and official API for utab, then we should that. If the utab file however is API too, then we can well go ahead and parse it directly. I'd like to keep the utab file private. The whole libmount API is abstraction and completely hides the difference between original /etc/mtab and new /run/mount/utab. The original Chris' purpose for the patches has been _netdev, it means to improve detection of dependence between filesystem and network. I still believe that the right way is to check for network block devices than rely on crazy _netdev userspace mount option. Wouldn't be enough to use Chris' iSCSI and FCoE auto detection? Or we can add udev rule to have NET_DEVBLK tag in udevdb, or maybe we can ask kernel developers to add /sys/block/sdb/network (as we already have removable in /sys). Chris, why do you want both ways (utab and the auto detection)? IMHO it would be really nice to completely kill _netdev in systemd universe ;-) It's legacy from shell init scripts. Anyway, if you still want to read userspace mount options than I can add something like: mnt_inotify_mtab_add_watch() mnt_inotify_mtab_rm_watch() mnt_inotify_mtab_changed() to hide all the paths and private mtab/utab stuff. THe new code looks also buggy to me. For example, am I missing something or is nobody creating /run/mount? I mean, we need to make sure that it exists before we can install an inotify watch on it. yep, it should be hidden within libmount Also, it leaves the cescape() calls in for what we read from libmount, which makes a lot of alarm bells ring in my head: doesn't libmount do that on its own? sure, libmount encodes/decodes all when read/write the files. Karel -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Mon, Dec 01, 2014 at 11:24:58AM +0100, Karel Zak wrote: On Fri, Nov 28, 2014 at 09:27:43PM +0100, Lennart Poettering wrote: On Fri, 28.11.14 20:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote: On Sun, Nov 23, 2014 at 08:33:41PM -0800, Chris Leech wrote: This adds auto detection for iSCSI and some FCoE drivers and treats mounts to file-systems on those devices as remote-fs. Signed-off-by: Chris Leech cle...@redhat.com No need for this. I now pushed patches 1-4 with some small changes here and there. Since libmount is not optional, I removed if from the version string. This patch I didn't push: this seems like something that would be better done through udev rules, by setting some tags. Then systemd could simply check for the presence of a tag on the device without having any special knowledge about iscsi and firends. Honestly, I am really not sure I like this patch. One one hand we now have a hard dep on libmount. Which I figure is mostly OK. However, what I find really weird about this is that even though libmount is supposed to abstract access to the utab away, it doesn't sufficiently: we still hardcode the utab path now so that we can watch it. I mean, either we use an abstracted interface or we don't. THis really smells to me as either libmount should provide some form of inotify iface to utab, or we should parse utab directly. If libmount is the only and official API for utab, then we should that. If the utab file however is API too, then we can well go ahead and parse it directly. I'd like to keep the utab file private. The whole libmount API is abstraction and completely hides the difference between original /etc/mtab and new /run/mount/utab. The original Chris' purpose for the patches has been _netdev, it means to improve detection of dependence between filesystem and network. I still believe that the right way is to check for network block devices than rely on crazy _netdev userspace mount option. Wouldn't be enough to use Chris' iSCSI and FCoE auto detection? Please see previous discussion... Detecting network might not be trivial if the devices are layered and there's a network-requiring device somewhere lower in the stack. Using _netdev is supposed to be an easy mechanism which admins can use in case whatever other schemes we have in place don't work. Or we can add udev rule to have NET_DEVBLK tag in udevdb, or maybe we can ask kernel developers to add /sys/block/sdb/network (as we already have removable in /sys). Chris, why do you want both ways (utab and the auto detection)? IMHO it would be really nice to completely kill _netdev in systemd universe ;-) It's legacy from shell init scripts. Anyway, if you still want to read userspace mount options than I can add something like: mnt_inotify_mtab_add_watch() mnt_inotify_mtab_rm_watch() mnt_inotify_mtab_changed() to hide all the paths and private mtab/utab stuff. Maybe something more like what journal client code does here: mnt_mtab_get_fd() - epoll fd mnt_mtab_get_events() - EPOLLIN|EPOLLPRI mnt_mtab_process() - information what changed and then mnt_mtab_wait() can be constructed on top of this, but systemd wouldn't be using it. THe new code looks also buggy to me. For example, am I missing something or is nobody creating /run/mount? I mean, we need to make sure that it exists before we can install an inotify watch on it. yep, it should be hidden within libmount Also, it leaves the cescape() calls in for what we read from libmount, which makes a lot of alarm bells ring in my head: doesn't libmount do that on its own? sure, libmount encodes/decodes all when read/write the files. Zbyszek ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Mon, Dec 01, 2014 at 02:28:33PM +0100, Zbigniew Jędrzejewski-Szmek wrote: Wouldn't be enough to use Chris' iSCSI and FCoE auto detection? Please see previous discussion... Detecting network might not be trivial if the devices are layered and there's a network-requiring device somewhere lower in the stack. Good point. (It would be possible to analyze whole stack by slave/holders relations, but I agree it's too complex and probably too fragile.) Anyway, if you still want to read userspace mount options than I can add something like: mnt_inotify_mtab_add_watch() mnt_inotify_mtab_rm_watch() mnt_inotify_mtab_changed() to hide all the paths and private mtab/utab stuff. Maybe something more like what journal client code does here: mnt_mtab_get_fd() - epoll fd mnt_mtab_get_events() - EPOLLIN|EPOLLPRI mnt_mtab_process() - information what changed Hmm.. utab is optional and very often does not exist, and library uses rename(2) to do atomic update of the file. This is reason why Chris have used inotify for /run/mount directory (and Lennart asked for inotify API). I have doubts you can use epoll to monitor directories, epoll is about I/O monitoring. Karel -- Karel Zak k...@redhat.com http://karelzak.blogspot.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, 28.11.14 20:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote: On Sun, Nov 23, 2014 at 08:33:41PM -0800, Chris Leech wrote: This adds auto detection for iSCSI and some FCoE drivers and treats mounts to file-systems on those devices as remote-fs. Signed-off-by: Chris Leech cle...@redhat.com No need for this. I now pushed patches 1-4 with some small changes here and there. Since libmount is not optional, I removed if from the version string. This patch I didn't push: this seems like something that would be better done through udev rules, by setting some tags. Then systemd could simply check for the presence of a tag on the device without having any special knowledge about iscsi and firends. Honestly, I am really not sure I like this patch. One one hand we now have a hard dep on libmount. Which I figure is mostly OK. However, what I find really weird about this is that even though libmount is supposed to abstract access to the utab away, it doesn't sufficiently: we still hardcode the utab path now so that we can watch it. I mean, either we use an abstracted interface or we don't. THis really smells to me as either libmount should provide some form of inotify iface to utab, or we should parse utab directly. If libmount is the only and official API for utab, then we should that. If the utab file however is API too, then we can well go ahead and parse it directly. (Karel, could you please comment on this?) THe new code looks also buggy to me. For example, am I missing something or is nobody creating /run/mount? I mean, we need to make sure that it exists before we can install an inotify watch on it. Then, the patch uses strcmp() and ints as bools, and things, misses error checking on unit_add_dependency_by_name(), defines its own desctruors instead of using DEFINE_TRIVIAL_CLEANUP. Also, it leaves the cescape() calls in for what we read from libmount, which makes a lot of alarm bells ring in my head: doesn't libmount do that on its own? Anyway, this patch really looks like it should have gone through a couple of more revisions before it got merged. And it raises a couple of general questions regarding utab and libmount and what is API and what isn't. (Sorry that I didn't find time to review this more quickly.) Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
On Fri, Nov 28, 2014 at 09:27:43PM +0100, Lennart Poettering wrote: On Fri, 28.11.14 20:50, Zbigniew Jędrzejewski-Szmek (zbys...@in.waw.pl) wrote: On Sun, Nov 23, 2014 at 08:33:41PM -0800, Chris Leech wrote: This adds auto detection for iSCSI and some FCoE drivers and treats mounts to file-systems on those devices as remote-fs. Signed-off-by: Chris Leech cle...@redhat.com No need for this. I now pushed patches 1-4 with some small changes here and there. Since libmount is not optional, I removed if from the version string. This patch I didn't push: this seems like something that would be better done through udev rules, by setting some tags. Then systemd could simply check for the presence of a tag on the device without having any special knowledge about iscsi and firends. Honestly, I am really not sure I like this patch. One one hand we now have a hard dep on libmount. Which I figure is mostly OK. However, what I find really weird about this is that even though libmount is supposed to abstract access to the utab away, it doesn't sufficiently: we still hardcode the utab path now so that we can watch it. I mean, either we use an abstracted interface or we don't. THis really smells to me as either libmount should provide some form of inotify iface to utab, or we should parse utab directly. If libmount is the only and official API for utab, then we should that. If the utab file however is API too, then we can well go ahead and parse it directly. (Karel, could you please comment on this?) THe new code looks also buggy to me. For example, am I missing something or is nobody creating /run/mount? I mean, we need to make sure that it exists before we can install an inotify watch on it. This is done in a my follow up patch. Then, the patch uses strcmp() and ints as bools, and things, misses error checking on unit_add_dependency_by_name(), defines its own desctruors instead of using DEFINE_TRIVIAL_CLEANUP. I fixed one place in a follow up patch, but I might have missed some. As for the DEFINE_TRIVIAL_CLEANUP: yeah, I looked at that, but thought that since mnt_free_* accept NULLs the extra check added by DEFINE_TRIVIAL_CLEANUP would be unnecessary. Looking at it again, this kind of micro-opt is stupid. I'll remove those custom functions in favour of DEFINE_TRIVIAL_CLEANUP. Also, it leaves the cescape() calls in for what we read from libmount, which makes a lot of alarm bells ring in my head: doesn't libmount do that on its own? Anyway, this patch really looks like it should have gone through a couple of more revisions before it got merged. And it raises a couple of general questions regarding utab and libmount and what is API and what isn't. (Sorry that I didn't find time to review this more quickly.) Zbyszek ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 5/5] mount: auto-detect iSCSI and FCoE as requiring network
This adds auto detection for iSCSI and some FCoE drivers and treats mounts to file-systems on those devices as remote-fs. Signed-off-by: Chris Leech cle...@redhat.com --- src/core/mount.c | 168 +++ 1 file changed, 158 insertions(+), 10 deletions(-) diff --git a/src/core/mount.c b/src/core/mount.c index 513dcec..52a4ba7 100644 --- a/src/core/mount.c +++ b/src/core/mount.c @@ -26,6 +26,7 @@ #include signal.h #include libmount.h #include sys/inotify.h +#include libudev.h #include manager.h #include unit.h @@ -44,6 +45,7 @@ #include bus-errors.h #include exit-status.h #include def.h +#include udev-util.h static const UnitActiveState state_translation_table[_MOUNT_STATE_MAX] = { [MOUNT_DEAD] = UNIT_INACTIVE, @@ -64,20 +66,166 @@ static const UnitActiveState state_translation_table[_MOUNT_STATE_MAX] = { static int mount_dispatch_timer(sd_event_source *source, usec_t usec, void *userdata); static int mount_dispatch_io(sd_event_source *source, int fd, uint32_t revents, void *userdata); -static bool mount_needs_network(const char *options, const char *fstype) { +static bool scsi_host_is(struct udev_device *host, const char *type) { +_cleanup_free_ char *path; +struct stat st; + +assert(host); +assert(type); + +/* a type_host device created by the scsi transport class + * should exists for all the transport types looked for */ + +if (asprintf(path, %s/%s_host, udev_device_get_syspath(host), type) 0) { +log_oom(); +return false; +} +return stat(path, st) == 0 S_ISDIR(st.st_mode); +} + +static bool scsi_host_is_network(struct udev_device *block) { +struct udev_device *host = NULL; +struct udev *udev; + +assert(block); +udev = udev_device_get_udev(block); + +host = udev_device_get_parent_with_subsystem_devtype(block, scsi, scsi_host); +if (!host) +return false; + +/* iSCSI */ +if (scsi_host_is(host, iscsi)) +return true; + +/* FCoE appears as an FC host with a parent FCoE Ctlr device + * This will at least detect the software fcoe module and bnx2fc on kernels = 3.5 */ +if (scsi_host_is(host, fc)) { +_cleanup_free_ char *fc_host_path = NULL; +_cleanup_udev_device_unref_ struct udev_device *fc_host = NULL; +struct udev_device *fcoe_ctlr = NULL; + +if (asprintf(fc_host_path, %s/fc_host/%s, udev_device_get_syspath(host), udev_device_get_sysname(host)) 0) { +log_oom(); +return false; +}; +fc_host = udev_device_new_from_syspath(udev, fc_host_path); +fcoe_ctlr = udev_device_get_parent_with_subsystem_devtype(fc_host, fcoe, fcoe_ctlr); +if (fcoe_ctlr) +return true; +} + +return false; +} + +static int dot_filter(const struct dirent *d) { +if (streq(., d-d_name) || streq(.., d-d_name)) +return 0; +return 1; +} + +static inline bool is_partition(struct udev_device *block) { +const char *devtype; + +devtype = udev_device_get_devtype(block); +if (streq(devtype, partition)) { +return true; +} +return false; +} + +static bool block_needs_network(struct udev_device *block) { +struct udev_device *parent = NULL; +_cleanup_free_ char *slaves_path = NULL; +_cleanup_free_ struct dirent **slaves = NULL; +struct udev *udev; +int n, i; +bool rc = false; + +assert(block); +udev = udev_device_get_udev(block); + +if (scsi_host_is_network(block)) +return true; + +/* if this is a partition, check the parent device */ + +if (is_partition(block)) { +parent = udev_device_get_parent(block); +if (parent block_needs_network(parent)) +return true; +} + +/* if this block device has slaves check them as well + * this handles DM maps including LVM and multipath */ + +if (asprintf(slaves_path, %s/slaves, udev_device_get_syspath(block)) 0) { +log_oom(); +return false; +} + +n = scandir(slaves_path, slaves, dot_filter, alphasort); +for (i = 0; i n; i++) { +_cleanup_udev_device_unref_ struct udev_device *slave = NULL; +_cleanup_free_ char *newpath = NULL; +_cleanup_free_ char *rp = NULL; + +if (asprintf(newpath, %s/%s, slaves_path, slaves[i]-d_name) 0) { +log_oom(); +goto free_the_slaves; +} +rp = realpath(newpath, NULL); +if (!rp) { +