Re: [systemd-devel] fsckd needs to go
2015-05-22 20:23 GMT+08:00 Martin Pitt:
> Hello Lennart,
>
> sorry for the late answer, got stuck in different things in the past
> two weeks..
>
> Lennart Poettering [2015-04-28 17:33 +0200]:
>> On Fri, 03.04.15 14:58, Lennart Poettering (lenn...@poettering.net) wrote:
>>
>> > systemd-fsckd would try to connect to some AF_UNIX/SOCK_STREAM socket
>> > in the fs, after forking and before execing fsck in the child, and
>> > pass the connected socket to fsck via the -C switch. If the socket is
>> > not connectable it would avoid any -C switch. With this simple change
>> > you can make this work for you: simply write a daemon (outside of
>> > systemd) that listens on that socket and reads the progress data from
>> > it. Using SO_PEERCRED you can query which fsck PID this is from and
>> > use it to kill it. You could even add this to ply natively if you
>> > wish, since it's kinda strange to bump this all off another daemon in
>> > the middle, unnecessarily.
>>
>> I implemented this now, and removed fsckd in the process. The
>> progress data is now available on /run/systemd/fsck.progress which
>> should be an AF_UNIX/SOCK_STREAM socket.
>
> Great, thanks! This works fine, it's very similar to what Didier did
> before. I.e. fsckd essentially works almost unmodified (except for
> adjusting the socket path).
>
> So we'll maintain that patch downstream now. It makes maintaining
> translations harder, but so be it.
>
>> Please test this, I only did some artificial testing myself, since I
>> don't use file systems that require fsck anymore myself.
>
> Neither do I, but there's always test/mocks/fsck which works very
> nicely.
>
> Thanks,
>
> Martin
>
> --
> Martin Pitt | http://www.piware.de
> Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)

Hey,

Just to mention it: we implemented a similar fsck progress report in
LOonux3[1] several years ago.

FYI:
* http://lists.freedesktop.org/archives/systemd-devel/2011-June/002654.html
* patch for systemd: https://github.com/cee1/systemd/commit/c04c709880f0619434ff58580609300d892f281b
* patch for plymouth: https://github.com/cee1/plymouth/commit/5be1bb7751b547fe5c125a42c3f2fe607568fa0f

--
1. http://dev.lemote.com/category/loonux3

Regards,
- cee1
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] fsckd needs to go
Hello Lennart,

sorry for the late answer, got stuck in different things in the past
two weeks..

Lennart Poettering [2015-04-28 17:33 +0200]:
> On Fri, 03.04.15 14:58, Lennart Poettering (lenn...@poettering.net) wrote:
>
> > systemd-fsckd would try to connect to some AF_UNIX/SOCK_STREAM socket
> > in the fs, after forking and before execing fsck in the child, and
> > pass the connected socket to fsck via the -C switch. If the socket is
> > not connectable it would avoid any -C switch. With this simple change
> > you can make this work for you: simply write a daemon (outside of
> > systemd) that listens on that socket and reads the progress data from
> > it. Using SO_PEERCRED you can query which fsck PID this is from and
> > use it to kill it. You could even add this to ply natively if you
> > wish, since it's kinda strange to bump this all off another daemon in
> > the middle, unnecessarily.
>
> I implemented this now, and removed fsckd in the process. The
> progress data is now available on /run/systemd/fsck.progress which
> should be an AF_UNIX/SOCK_STREAM socket.

Great, thanks! This works fine, it's very similar to what Didier did
before. I.e. fsckd essentially works almost unmodified (except for
adjusting the socket path).

So we'll maintain that patch downstream now. It makes maintaining
translations harder, but so be it.

> Please test this, I only did some artificial testing myself, since I
> don't use file systems that require fsck anymore myself.

Neither do I, but there's always test/mocks/fsck which works very
nicely.

Thanks,

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Re: [systemd-devel] fsckd needs to go
On Fri, 03.04.15 14:58, Lennart Poettering (lenn...@poettering.net) wrote:

> systemd-fsckd would try to connect to some AF_UNIX/SOCK_STREAM socket
> in the fs, after forking and before execing fsck in the child, and
> pass the connected socket to fsck via the -C switch. If the socket is
> not connectable it would avoid any -C switch. With this simple change
> you can make this work for you: simply write a daemon (outside of
> systemd) that listens on that socket and reads the progress data from
> it. Using SO_PEERCRED you can query which fsck PID this is from and
> use it to kill it. You could even add this to ply natively if you
> wish, since it's kinda strange to bump this all off another daemon in
> the middle, unnecessarily.

I implemented this now, and removed fsckd in the process. The
progress data is now available on /run/systemd/fsck.progress which
should be an AF_UNIX/SOCK_STREAM socket. If you listen on it you will
get the raw fsck progress data through it. With SO_PEERCRED you can
figure out which fsck process is on the other side. If you do not
listen on it the progress data is instead printed to /dev/console
after converting it to percentage data.

Please test this, I only did some artificial testing myself, since I
don't use file systems that require fsck anymore myself.

Sorry again for communicating this so badly initially!

Lennart

--
Lennart Poettering, Red Hat
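For readers who want to try the listening side described above, here is a
minimal sketch in Python (not from the thread, and Linux-only). It binds an
AF_UNIX/SOCK_STREAM socket, accepts one connection, and uses SO_PEERCRED to
learn the PID of the connecting process. The helper names, the temporary
socket path and the sample progress line are assumptions made for
illustration; a real listener would bind /run/systemd/fsck.progress and read
whatever fsck actually writes.

```python
import os
import socket
import struct
import tempfile


def listen_on(path):
    """Create a listening AF_UNIX/SOCK_STREAM socket bound to `path`."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(8)
    return srv


def peer_credentials(conn):
    """Return (pid, uid, gid) of the peer via SO_PEERCRED (Linux only)."""
    fmt = "iII"  # struct ucred: pid_t, uid_t, gid_t
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


def demo():
    """Connect to our own listener and report who is on the other side."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for /run/systemd/fsck.progress.
        path = os.path.join(tmp, "fsck.progress")
        srv = listen_on(path)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        # Made-up sample line; the real format is whatever fsck -C emits.
        client.sendall(b"1 5 100 /dev/sda1\n")
        conn, _ = srv.accept()
        pid, uid, gid = peer_credentials(conn)
        data = conn.recv(4096)
        for s in (conn, client, srv):
            s.close()
        return pid, data


if __name__ == "__main__":
    print(demo())
```

The PID returned by peer_credentials() is the one such a daemon could use to
identify the fsck process behind a given connection.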
Re: [systemd-devel] fsckd needs to go -- possible compromise?
On Wed, Apr 8, 2015 at 4:18 PM, Martin Pitt wrote:
> Lennart Poettering [2015-04-07 16:14 +0200]:
>> Well, the async IO socket handling thing was not dealt with. The newest
>> patches still use fgets().
>> [...]
>> The killer issue really is the safety issue. We shouldn't include
>> code in systemd that makes dangerous things like killing running
>> fscks an easily accessible operation, that has a graphical UI and
>> requires no authentication.
>
> So, would you reconsider your position if we address the two things
> above? I.e. replace fgets() by our own async buffering, and entirely
> remove the cancel support? Then we'd still get a proper feedback
> during boot instead of leaving the user in the dark why booting is
> stuck, but it stays noninteractive.

I don't think there is enough justification for a fsck daemon. Large
filesystems which need fsck in userspace are a thing from the past and
insufficiently developed technology for today's operating system
tasks. Basic filesystem consistency and maintenance tasks belong in
the kernel and nowhere else.

We made it just fine into the year 2015 with the support for the
legacy filesystems, and we did not need a specialized daemon so far.
Therefore, we can expect that the current level of support will be
sufficient for the coming years. We will support them well enough
until everybody finally realizes that they do not solve the problems
we face today, and that they need to be replaced.

Please keep things like fsckd in the distribution that wants to make
such promises about legacy technology. Systemd upstream should focus
on current and future technologies and not pimp up outdated
facilities, waste our time, and add more complex logic and rules in
the basic boot process.

Thanks,
Kay
Re: [systemd-devel] fsckd needs to go -- possible compromise?
Hello all,

Lennart Poettering [2015-04-07 16:14 +0200]:
> Well, the async IO socket handling thing was not dealt with. The newest
> patches still use fgets().
> [...]
> The killer issue really is the safety issue. We shouldn't include
> code in systemd that makes dangerous things like killing running
> fscks an easily accessible operation, that has a graphical UI and
> requires no authentication.

So, would you reconsider your position if we address the two things
above? I.e. replace fgets() by our own async buffering, and entirely
remove the cancel support? Then we'd still get a proper feedback
during boot instead of leaving the user in the dark why booting is
stuck, but it stays noninteractive.

Thanks,

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Re: [systemd-devel] fsckd needs to go
On Wed, 08.04.15 13:03, Reindl Harald (h.rei...@thelounge.net) wrote:

> >>https://bugzilla.redhat.com/show_bug.cgi?id=1105877
> >
> >Hmm? i don't understand what that bug is about? Is it about /forcefsck
> >being ignored?
>
> it is about "warning: checktime reached, running e2fsck is recommended" but
> the check didn't happen and that you *need* to "touch /forcefsck" while it
> should happen automatically

OK, reassigned to the kernel. It's somewhere between the kernel and
e2fsck to figure this out. We will always call fsck, it's up to fsck
to do something, and if it decides not to, then it would need to say
why, and get the kernel in sync...

> >And what does this bug have to do with systemd?
>
> i don't get your reasoning for "Maybe the right fix for Ubuntu is to stop
> enabling the "routine" check logic?" because as seen a few months ago this
> routine check is important, otherwise you may not notice existing corruption
> (for whatever reason) until it is too late

Well, the file system folks at RH decided this makes no sense long
ago, please bring this up with them. Also note that the change RH was
carrying a long time is now upstream (see Martin's link), hence bring
this up with them. systemd is not involved in this.

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] fsckd needs to go
On 08.04.2015 at 12:48, Lennart Poettering wrote:
> On Wed, 08.04.15 12:31, Reindl Harald (h.rei...@thelounge.net) wrote:
>> On 08.04.2015 at 12:27, Lennart Poettering wrote:
>>> Well, the routine check is only done by Ubuntu/Debian, it is not
>>> enabled on any enterprise distro or on Fedora. Maybe Ubuntu/Debian
>>> should also turn this off?
>>>
>>> Note that the routine check is not different than a normal check
>>> really, it just is triggered by a mount counter instead of a dirty
>>> flag, that's all. Hence it makes little difference what you cancel,
>>> both are dangerous, and a bad idea to allow unauthenticated.
>>>
>>> Also, to my knowledge plymouth on Ubuntu never showed a different UI
>>> for both cases, did it? How is the admin supposed to know when it is
>>> just dangerous to cancel the fsck (in your "routine" check case), and
>>> when it is extra dangerous (in the non-"routine" check case)?
>>>
>>> Maybe the right fix for Ubuntu is to stop enabling the "routine" check
>>> logic?
>>
>> why would you want to disable it?
>>
>> short before christmas i had a faulty ext4 FS needing even manual
>> confirmation of repairs - i don't think it's a good idea to not trigger that
>> automatically and frankly it *should have been* triggered that way
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1105877
>
> Hmm? i don't understand what that bug is about? Is it about /forcefsck
> being ignored?

it is about "warning: checktime reached, running e2fsck is recommended" but
the check didn't happen and that you *need* to "touch /forcefsck" while it
should happen automatically

> And what does this bug have to do with systemd?

i don't get your reasoning for "Maybe the right fix for Ubuntu is to stop
enabling the "routine" check logic?" because as seen a few months ago this
routine check is important, otherwise you may not notice existing corruption
(for whatever reason) until it is too late
Re: [systemd-devel] fsckd needs to go
On Wed, 08.04.15 12:31, Reindl Harald (h.rei...@thelounge.net) wrote:

> On 08.04.2015 at 12:27, Lennart Poettering wrote:
> >Well, the routine check is only done by Ubuntu/Debian, it is not
> >enabled on any enterprise distro or on Fedora. Maybe Ubuntu/Debian
> >should also turn this off?
> >
> >Note that the routine check is not different than a normal check
> >really, it just is triggered by a mount counter instead of a dirty
> >flag, that's all. Hence it makes little difference what you cancel,
> >both are dangerous, and a bad idea to allow unauthenticated.
> >
> >Also, to my knowledge plymouth on Ubuntu never showed a different UI
> >for both cases, did it? How is the admin supposed to know when it is
> >just dangerous to cancel the fsck (in your "routine" check case), and
> >when it is extra dangerous (in the non-"routine" check case)?
> >
> >Maybe the right fix for Ubuntu is to stop enabling the "routine" check
> >logic?
>
> why would you want to disable it?
>
> short before christmas i had a faulty ext4 FS needing even manual
> confirmation of repairs - i don't think it's a good idea to not trigger that
> automatically and frankly it *should have been* triggered that way
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1105877

Hmm? I don't understand what that bug is about? Is it about /forcefsck
being ignored?

And what does this bug have to do with systemd?

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] fsckd needs to go
Lennart Poettering [2015-04-08 12:27 +0200]:
> Maybe the right fix for Ubuntu is to stop enabling the "routine" check
> logic?

This already happened a while ago, through

http://git.whamcloud.com/tools/e2fsprogs.git/commitdiff/3daf592646

So this indeed only affects older/upgraded installations.

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Re: [systemd-devel] fsckd needs to go
On 08.04.2015 at 12:27, Lennart Poettering wrote:
> Well, the routine check is only done by Ubuntu/Debian, it is not
> enabled on any enterprise distro or on Fedora. Maybe Ubuntu/Debian
> should also turn this off?
>
> Note that the routine check is not different than a normal check
> really, it just is triggered by a mount counter instead of a dirty
> flag, that's all. Hence it makes little difference what you cancel,
> both are dangerous, and a bad idea to allow unauthenticated.
>
> Also, to my knowledge plymouth on Ubuntu never showed a different UI
> for both cases, did it? How is the admin supposed to know when it is
> just dangerous to cancel the fsck (in your "routine" check case), and
> when it is extra dangerous (in the non-"routine" check case)?
>
> Maybe the right fix for Ubuntu is to stop enabling the "routine" check
> logic?

why would you want to disable it?

short before christmas i had a faulty ext4 FS needing even manual
confirmation of repairs - i don't think it's a good idea to not trigger that
automatically and frankly it *should have been* triggered that way

https://bugzilla.redhat.com/show_bug.cgi?id=1105877
Re: [systemd-devel] fsckd needs to go
On Wed, 08.04.15 10:46, Martin Pitt (martin.p...@ubuntu.com) wrote:

> Reindl Harald [2015-04-08 10:32 +0200]:
> > nobody needs the ability to cancel a fsck because hardly anybody has an
> > insight if the moment doing so is horribly dangerous and given that fsck
> > doesn't run for fun why would you want to interrupt it and risk data loss?
>
> You don't risk data loss by interrupting a routine check (that still
> happens on ext[234] every so often).

Well, the routine check is only done by Ubuntu/Debian, it is not
enabled on any enterprise distro or on Fedora. Maybe Ubuntu/Debian
should also turn this off?

Note that the routine check is not different than a normal check
really, it just is triggered by a mount counter instead of a dirty
flag, that's all. Hence it makes little difference what you cancel,
both are dangerous, and a bad idea to allow unauthenticated.

Also, to my knowledge plymouth on Ubuntu never showed a different UI
for both cases, did it? How is the admin supposed to know when it is
just dangerous to cancel the fsck (in your "routine" check case), and
when it is extra dangerous (in the non-"routine" check case)?

Maybe the right fix for Ubuntu is to stop enabling the "routine" check
logic?

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] fsckd needs to go
On Tue, 07.04.15 18:02, Dimitri John Ledkov (dimitri.j.led...@intel.com) wrote:

> On 3 April 2015 at 05:58, Lennart Poettering wrote:
> > Heya,
> >
> > so we discussed the whole fsckd situation a bit more here in Berlin,
> > and we came to the conclusion that fsckd really should not exist the
> > way it does in systemd.
> >
> > To start with, the code is really wrong, it should never have been
> > merged in its current state, the read/write logic for the sockets is
> > completely borked (I cannot even boot my own machine reliably with
> > it!). And to my knowledge there has been no attempt to fix all of
> > that, even though I asked for it. It also doesn't do at all what I
> > suggested initially, as the flow of data is now fsck → systemd-fsck →
> > systemd-fsckd → plymouth, and that's just crazy, that's two steps too
> > many. systemd is supposed to be a few components playing well
> > together, but certainly not a baroque network of components where data
> > is passed through four hoops before it reaches the destination...
> >
> > Then, there's my general reservation with fsckd at all: file systems
> > that still require offline fsck are certainly not the future, but we
> > develop stuff for the future, and the idea to kill an fsck process
> > while it is running is also very very questionable. There's a reason
>
> Is this about progress & control data or all things fsck?

Well, ext234 require fsck, there's no way around it. We need to call
it, and we will. But the idea of beefing this up with a UI and
specifically with an unauthenticated way to kill fsck while it is
ongoing, which is an inherently unsafe operation, is what I have
issues with.

> IMHO we do need to continue to support ext4, and running fsck.ext4 when
> enforced, at least from initramfs, with progress output to the user
> and ability to cancel. Or is even fsck.ext4 obsolete these days and
> shouldn't be run automatically any more?

Nope. ext2, ext3, ext4, fat require an fsck tool to be run, and we
will.

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] fsckd needs to go
Reindl Harald [2015-04-08 10:32 +0200]:
> nobody needs the ability to cancel a fsck because hardly anybody has an
> insight if the moment doing so is horribly dangerous and given that fsck
> doesn't run for fun why would you want to interrupt it and risk data loss?

You don't risk data loss by interrupting a routine check (that still
happens on ext[234] every so often).

But anyway, I don't mind much dropping the cancel ability, but we do
want a proper progress report. fsck can take an effing long time with
large spinning rust, and without progress report users will just
consider the boot hanging/broken and switch off the machine. That's a
lot riskier :-)

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Re: [systemd-devel] fsckd needs to go
On 08.04.2015 at 03:02, Dimitri John Ledkov wrote:
> Is this about progress & control data or all things fsck?
>
> IMHO we do need to continue to support ext4, and running fsck.ext4 when
> enforced, at least from initramfs, with progress output to the user
> and ability to cancel

nobody needs the ability to cancel a fsck because hardly anybody has an
insight if the moment doing so is horribly dangerous and given that fsck
doesn't run for fun why would you want to interrupt it and risk data loss?
Re: [systemd-devel] fsckd needs to go
On 3 April 2015 at 05:58, Lennart Poettering wrote:
> Heya,
>
> so we discussed the whole fsckd situation a bit more here in Berlin,
> and we came to the conclusion that fsckd really should not exist the
> way it does in systemd.
>
> To start with, the code is really wrong, it should never have been
> merged in its current state, the read/write logic for the sockets is
> completely borked (I cannot even boot my own machine reliably with
> it!). And to my knowledge there has been no attempt to fix all of
> that, even though I asked for it. It also doesn't do at all what I
> suggested initially, as the flow of data is now fsck → systemd-fsck →
> systemd-fsckd → plymouth, and that's just crazy, that's two steps too
> many. systemd is supposed to be a few components playing well
> together, but certainly not a baroque network of components where data
> is passed through four hoops before it reaches the destination...
>
> Then, there's my general reservation with fsckd at all: file systems
> that still require offline fsck are certainly not the future, but we
> develop stuff for the future, and the idea to kill an fsck process
> while it is running is also very very questionable. There's a reason

Is this about progress & control data or all things fsck?

IMHO we do need to continue to support ext4, and running fsck.ext4
when enforced, at least from initramfs, with progress output to the
user and ability to cancel. Or is even fsck.ext4 obsolete these days
and shouldn't be run automatically any more?

How this is implemented - e.g. inside the systemd project or not - is
not relevant, but systemd seems to be a better place for this.

In the upstart world, this was completely offloaded to mountall, which
directly passed "special update" messages to plymouthd, which themes
could choose to parse and display / act upon. This however was an
Ubuntu-specific patch, I believe.

The current implementation/integration for systemd-fsck is also
heading to plymouth upstream for generic support there in themes, I
believe.

--
Regards,

Dimitri.

https://clearlinux.org
Open Source Technology Center
Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ.
Re: [systemd-devel] fsckd needs to go
On Mon, 06.04.15 15:21, Martin Pitt (martin.p...@ubuntu.com) wrote:

> Hello all,

Heya,

> Lennart Poettering [2015-04-03 16:34 +0200]:
> > Well, I had a brief look at this patch, but it still doesn't get the
> > socket IO stuff right. It uses synchronous fgets() to read things off
> > the sockets, that's still not OK, and is a major thing that is
> > wrong.
>
> fsckd kicks malicious/broken fsck clients which send garbage, but if
> you want to do the buffering explicitly I can rework the patch to do
> that.

It's not about sending garbage. It's about blocking. fsckd is supposed
to be a daemon talking to multiple clients, and hence it may never
block. It's how daemons on UNIX work... Hence fgets() on client
sockets has *no* place in the fsckd sources.

> That is, if we actually keep fsckd in the upstream sources :-)
> I wouldn't like to spend time on this if you already pre-decided to
> kick this out, but I would ask to reconsider, and instead discuss
> what's wrong with the code.

Yeah, we decided to remove this, sorry!

I can only recommend fixing the async socket IO thing even if you
decide to maintain fsckd outside of systemd. It's just broken!

Lennart

--
Lennart Poettering, Red Hat
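To make the blocking concern concrete, here is a small Python sketch (not
code from fsckd or systemd) of the alternative to fgets(): put the client
socket in non-blocking mode, read whatever bytes are available, and keep any
incomplete line in a per-connection buffer until its newline arrives. The
names LineBuffer and read_available are hypothetical, and the sample progress
text in the usage note is made up.

```python
import socket


class LineBuffer:
    """Per-connection buffer that only hands out complete lines."""

    def __init__(self):
        self._pending = b""

    def feed(self, data):
        """Add received bytes; return the list of lines completed so far."""
        self._pending += data
        # Everything before the last newline is complete; the remainder
        # (possibly empty) stays buffered for the next call.
        *lines, self._pending = self._pending.split(b"\n")
        return [line.decode("ascii", "replace") for line in lines]


def read_available(conn, buf):
    """Read once from a non-blocking socket; return (lines, eof).

    Never waits: if no data is ready, it returns immediately so the
    caller can go on serving other clients.
    """
    try:
        data = conn.recv(4096)
    except BlockingIOError:
        return [], False
    if not data:
        return [], True  # peer closed the connection
    return buf.feed(data), False
```

A client that sends half a line and then stalls only leaves bytes sitting in
its own LineBuffer; the daemon's loop moves on. In a real daemon,
read_available() would be called from an event loop (for example Python's
selectors module) whenever a socket is reported readable, with each
connection set up via conn.setblocking(False).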
Re: [systemd-devel] fsckd needs to go
On Mon, 06.04.15 15:12, Martin Pitt (martin.p...@ubuntu.com) wrote:

> Hello Lennart, all,
>
> Lennart Poettering [2015-04-03 14:58 +0200]:
> > To start with, the code is really wrong, it should never have been
> > merged in its current state, the read/write logic for the sockets is
> > completely borked (I cannot even boot my own machine reliably with
> > it!).
>
> This is surprising indeed. If that's not just the journald/logind/D-Bus
> corruption (which we still haven't tracked down properly), do you have
> a journal of a hung boot? We never saw a boot failure due to fsck so
> far, so I'm naturally very interested in seeing what's wrong.

Sorry, no logs here, I already removed the thing. Sorry.

> > And to my knowledge there has been no attempt to fix all of that,
> > even though I asked for it.
>
> As far as I see, every point that came up during reviews, including
> your recent one about "don't route fsck output through systemd-fsck"
> got addressed (that latter patch hasn't been committed though, I
> thought you wanted to review it yourself).

Well, the async IO socket handling thing was not dealt with. The
newest patches still use fgets(). Using stdio for processing sockets
is generally not a good idea, since it's blocking. And since you want
to process multiple connections at the same time you don't want
blocking. This is really broken. Currently, if one fsck sends half a
line, then this causes your daemon to hang forever... This is not
acceptable in our sources, sorry.

> > It also doesn't do at all what I suggested initially, as the flow of
> > data is now fsck → systemd-fsck → systemd-fsckd → plymouth, and
> > that's just crazy, that's two steps too many.
>
> With the above patch it's fsck -> systemd-fsckd → plymouth, and I
> don't see how to eliminate yet another step?

For example, by making ply listen directly on the socket, instead of
making this indirect via fsckd...

> > Then, there's my general reservation with fsckd at all: file systems
> > that still require offline fsck are certainly not the future, but we
> > develop stuff for the future
>
> I do agree with the sentiment; let me assure you that we don't easily
> spend days on such stuff in vain, but it's because there are millions
> of existing installations out there which still do have ext4 and fsck.
> If systemd upstreams say "we don't care about existing products, only
> about a future with just btrfs" that's your prerogative of course, but
> distros need to have a more product-oriented focus :-/

This is only one reason of many.

The killer issue really is the safety issue. We shouldn't include
code in systemd that makes dangerous things like killing running
fscks an easily accessible operation, that has a graphical UI and
requires no authentication.

> > I hope such a solution is acceptable?
>
> The data flow is very similar to what we have now, so this mostly
> amounts to maintaining fsckd in the systemd sources vs. maintaining it
> separately in Debian/Ubuntu. I'd be interested in what
> RHEL/SUSE/Arch/etc. want to do.

We never had code for this in Fedora/RHEL, and that's not going to
change. The ability to have a graphical UI for killing fscks without
authentication was an Ubuntu thing, and I figure it's going to stay
one.

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] fsckd needs to go
Hello all,

Lennart Poettering [2015-04-03 16:34 +0200]:
> Well, I had a brief look at this patch, but it still doesn't get the
> socket IO stuff right. It uses synchronous fgets() to read things off
> the sockets, that's still not OK, and is a major thing that is
> wrong.

fsckd kicks malicious/broken fsck clients which send garbage, but if
you want to do the buffering explicitly I can rework the patch to do
that.

That is, if we actually keep fsckd in the upstream sources :-)
I wouldn't like to spend time on this if you already pre-decided to
kick this out, but I would ask to reconsider, and instead discuss
what's wrong with the code.

Thanks,

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
Re: [systemd-devel] fsckd needs to go
Hello Lennart, all,

Lennart Poettering [2015-04-03 14:58 +0200]:
> To start with, the code is really wrong, it should never have been
> merged in its current state, the read/write logic for the sockets is
> completely borked (I cannot even boot my own machine reliably with
> it!).

This is surprising indeed. If that's not just the journald/logind/D-Bus
corruption (which we still haven't tracked down properly), do you have
a journal of a hung boot? We never saw a boot failure due to fsck so
far, so I'm naturally very interested in seeing what's wrong.

> And to my knowledge there has been no attempt to fix all of that,
> even though I asked for it.

As far as I see, every point that came up during reviews, including
your recent one about "don't route fsck output through systemd-fsck"
got addressed (that latter patch hasn't been committed though, I
thought you wanted to review it yourself).

> It also doesn't do at all what I suggested initially, as the flow of
> data is now fsck → systemd-fsck → systemd-fsckd → plymouth, and
> that's just crazy, that's two steps too many.

With the above patch it's fsck -> systemd-fsckd → plymouth, and I
don't see how to eliminate yet another step?

> Then, there's my general reservation with fsckd at all: file systems
> that still require offline fsck are certainly not the future, but we
> develop stuff for the future

I do agree with the sentiment; let me assure you that we don't easily
spend days on such stuff in vain, but it's because there are millions
of existing installations out there which still do have ext4 and fsck.
If systemd upstreams say "we don't care about existing products, only
about a future with just btrfs" that's your prerogative of course, but
distros need to have a more product-oriented focus :-/

> systemd-fsckd would try to connect to some AF_UNIX/SOCK_STREAM socket
> in the fs, after forking and before execing fsck in the child, and
> pass the connected socket to fsck via the -C switch. If the socket is
> not connectable it would avoid any -C switch. With this simple change
> you can make this work for you: simply write a daemon (outside of
> systemd) that listens on that socket and reads the progress data from
> it. Using SO_PEERCRED you can query which fsck PID this is from and
> use it to kill it. You could even add this to ply natively if you
> wish, since it's kinda strange to bump this all off another daemon in
> the middle, unnecessarily.
>
> Changing this would actually make it very close to my initial
> suggestion, except that we would not have the receiving side for the
> progress data in systemd, you'd have to maintain that externally (or
> in ply).
>
> I hope such a solution is acceptable?

The data flow is very similar to what we have now, so this mostly
amounts to maintaining fsckd in the systemd sources vs. maintaining it
separately in Debian/Ubuntu. I'd be interested in what
RHEL/SUSE/Arch/etc. want to do.

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
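The connecting side of the proposal quoted above (try the socket, and only
pass -C if the connection succeeds) can be sketched as follows in Python.
This is an illustration under stated assumptions, not the systemd
implementation: the function name fsck_command is hypothetical, and the -a
and -T options are simply plausible fsck options chosen for the example. The
only behaviour taken from the thread is "connected socket goes to fsck via
-C, otherwise no -C at all".

```python
import socket


def fsck_command(device, progress_path):
    """Build an fsck argv; add -C<fd> only if the progress socket accepts us.

    Returns (argv, sock). `sock` is None when nothing is listening, in
    which case the argv carries no -C switch at all.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(progress_path)
    except OSError:
        # Nobody is listening (or the path does not exist): no -C switch.
        sock.close()
        return ["fsck", "-a", "-T", device], None
    # The descriptor must survive exec so that fsck can write to it.
    sock.set_inheritable(True)
    return ["fsck", "-a", "-T", "-C%d" % sock.fileno(), device], sock
```

A caller would then fork and exec the returned argv while keeping the
descriptor open in the child (with Python's subprocess module that means
passing it via pass_fds), so the listener on the other end receives the
progress data directly from fsck.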
Re: [systemd-devel] fsckd needs to go
On Fri, 03.04.15 15:17, Didier Roche (didro...@ubuntu.com) wrote:

> > To start with, the code is really wrong, it should never have been
> > merged in its current state, the read/write logic for the sockets is
> > completely borked (I cannot even boot my own machine reliably with
> > it!). And to my knowledge there has been no attempt to fix all of
> > that, even though I asked for it. It also doesn't do at all what I
> > suggested initially, as the flow of data is now fsck → systemd-fsck →
> > systemd-fsckd → plymouth, and that's just crazy, that's two steps too
> > many. systemd is supposed to be a few components playing well
> > together, but certainly not a baroque network of components where data
> > is passed through four hoops before it reaches the destination...
>
> I misunderstood first what you wanted in 2011, reading back from the mailing
> list. You would have noted that no comment (even on the first review which
> were made) raised those points in the multiple reviews that occurred, hence
> it was merged. It's weird that it doesn't even boot your own machine
> reliably, as we have the first implementation running on all vivid machines
> by default, and it seems from the bug reports, reliably.

It might have to do with the fact that our ply setup (with themes,
...) is different from Ubuntu's.

> However, I'm a bit surprised about the statement that no attempt has been
> done to fix it. I think you saw I have always been responsive, prioritizing
> your suggestions over other work to fix them. When you did your first public
> personal reserves about fsckd on the mailing list and I understood what
> flow you wanted[1], I posted fixes *the day after* (with some back and forth
> review) to address your comments.

I reviewed some of the initial patches, but please note that it was
merged by Pitti before I had a final look at it.

Well, the major problems (with the socket handling) are still
completely unfixed, and no patch has been posted afaics that fixed
that.

I am aware that we didn't communicate this all properly, but Kay,
Daniel, David and I only sat down the day before yesterday to come to
a conclusion about all of this.

> All of them were merged by other systemd hackers and some even by ourselves,
> but the biggest one, which directly addressed and implemented the flow of
> data you explicitly asked for is still waiting:
> http://lists.freedesktop.org/archives/systemd-devel/2015-March/029309.html
> (Note that this was proposed less than 48 hours after your complaint about
> the data flow). Knowing that you were on holidays, I didn't push others too
> much, but Martin and I pinged you on IRC about it when you were back. Am I
> missing anything?

Well, I had a brief look at this patch, but it still doesn't get the
socket IO stuff right. It uses synchronous fgets() to read things off
the sockets, that's still not OK, and is a major thing that is
wrong. By looking at the patch I am pretty sure this all will lock up
if you have multiple fscks, to the point where you cause all fscks but
the first one to stop until the first one is finished, and so on...

(It does remove the extra bumping off systemd-fsckd though, that's
good!)

> > Then, there's my general reservation with fsckd at all: file systems
> > that still require offline fsck are certainly not the future, but we
> > develop stuff for the future, and the idea to kill an fsck process
> > while it is running is also very very questionable. There's a reason
> > why such functionality never existed on Fedora or RHEL: it's risky. I
> > mean, it's all good allowing people to shoot themselves in the foot,
> > but there's really *no* point in making that easy and giving it a
> > fancy UI with support in the graphical boot splash. Shooting yourself
> > in the foot should be possible, but not *easily*! And certainly not be
> > allowed without prior authentication like you are doing it right now
> > with the plymouth support.
>
> I can understand those points, just a little bit disappointed that wasn't
> stated months ago, when we started to work on it and before the whole
> refactoring…

Yes, sorry for that. We should have sat down earlier, and come to a
conclusion about this. Sorry for the unclear message we were sending!

Lennart

-- 
Lennart Poettering, Red Hat
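The blocking concern raised above (a synchronous fgets() on one connection stalling every other fsck) can be illustrated with a small reader that multiplexes all connections through one selector. This is only a sketch of the general technique, not code from the patch under review; the class name ProgressMux and the newline-terminated record format are assumptions made for illustration.

```python
import selectors
import socket


class ProgressMux:
    """Collect newline-terminated records from many sockets
    without ever blocking on a single one (hypothetical helper)."""

    def __init__(self):
        self.sel = selectors.DefaultSelector()
        self.buffers = {}  # socket -> bytes received but not yet ended by b"\n"

    def add(self, conn):
        conn.setblocking(False)
        self.buffers[conn] = b""
        self.sel.register(conn, selectors.EVENT_READ)

    def poll(self, timeout=0.1):
        """Return (socket, line) pairs for every complete line that is ready."""
        lines = []
        for key, _ in self.sel.select(timeout):
            conn = key.fileobj
            try:
                chunk = conn.recv(4096)
            except BlockingIOError:
                continue
            if not chunk:
                # Peer closed the connection: forget it.
                self.sel.unregister(conn)
                del self.buffers[conn]
                conn.close()
                continue
            data = self.buffers[conn] + chunk
            # Everything before the last b"\n" is complete; keep the tail.
            *complete, rest = data.split(b"\n")
            self.buffers[conn] = rest
            lines.extend((conn, l.decode(errors="replace")) for l in complete)
        return lines
```

A writer that has sent only half a line leaves its bytes in the per-socket buffer, so a second fsck that has a full line ready is served immediately instead of waiting for the first one.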
Re: [systemd-devel] fsckd needs to go
On 03/04/2015 14:58, Lennart Poettering wrote:
> Heya,

Hey Lennart,

> so we discussed the whole fsckd situation a bit more here in Berlin,
> and we came to the conclusion that fsckd really should not exist the
> way it does in systemd.
>
> To start with, the code is really wrong, it should never have been
> merged in its current state, the read/write logic for the sockets is
> completely borked (I cannot even boot my own machine reliably with
> it!). And to my knowledge there has been no attempt to fix all of
> that, even though I asked for it. It also doesn't do at all what I
> suggested initially, as the flow of data is now fsck → systemd-fsck →
> systemd-fsckd → plymouth, and that's just crazy, that's two steps too
> many. systemd is supposed to be a few components playing well
> together, but certainly not a baroque network of components where data
> is passed through four hoops before it reaches the destination...

I misunderstood first what you wanted in 2011, reading back from the
mailing list. You would have noted that no comment (even on the first
review which were made) raised those points in the multiple reviews that
occurred, hence it was merged. It's weird that it doesn't even boot your
own machine reliably, as we have the first implementation running on all
vivid machines by default, and it seems from the bug reports, reliably.

However, I'm a bit surprised about the statement that no attempt has
been done to fix it. I think you saw I have always been responsive,
prioritizing your suggestions over other work to fix them. When you did
your first public personal reserves about fsckd on the mailing list and
I understood what flow you wanted[1], I posted fixes *the day after*
(with some back and forth review) to address your comments.

All of them were merged by other systemd hackers and some even by
ourselves, but the biggest one, which directly addressed and implemented
the flow of data you explicitly asked for is still waiting:
http://lists.freedesktop.org/archives/systemd-devel/2015-March/029309.html
(Note that this was proposed less than 48 hours after your complaint
about the data flow). Knowing that you were on holidays, I didn't push
others too much, but Martin and I pinged you on IRC about it when you
were back. Am I missing anything?

> Then, there's my general reservation with fsckd at all: file systems
> that still require offline fsck are certainly not the future, but we
> develop stuff for the future, and the idea to kill an fsck process
> while it is running is also very very questionable. There's a reason
> why such functionality never existed on Fedora or RHEL: it's risky. I
> mean, it's all good allowing people to shoot themselves in the foot,
> but there's really *no* point in making that easy and giving it a
> fancy UI with support in the graphical boot splash. Shooting yourself
> in the foot should be possible, but not *easily*! And certainly not be
> allowed without prior authentication like you are doing it right now
> with the plymouth support.

I can understand those points, just a little bit disappointed that
wasn't stated months ago, when we started to work on it and before the
whole refactoring…

> Thus, we decided to remove fsckd again entirely from systemd. However,
> if Ubuntu really wants to implement this anyway (I strongly believe
> that this is an absolute misfeature!), then I'd be willing to add the
> following for you:
>
> systemd-fsckd would try to connect to some AF_UNIX/SOCK_STREAM socket
> in the fs, after forking and before execing fsck in the child, and
> pass the connected socket to fsck via the -C switch. If the socket is
> not connectable it would avoid any -C switch. With this simple change
> you can make this work for you: simply write a daemon (outside of
> systemd) that listens on that socket and reads the progress data from
> it. Using SO_PEERCRED you can query which fsck PID this is from and
> use it to kill it. You could even add this to ply natively if you
> wish, since it's kinda strange to bump this all off another daemon in
> the middle, unnecessarily.
>
> Changing this would actually make it very close to my initial
> suggestion, except that we would not have the receiving side for the
> progress data in systemd, you'd have to maintain that externally (or
> in ply).

Not sure we are going to change it again so close to the vivid final
release. We did implement all your suggestions and fixed it to match
those. I'm feeling a little bit uneasy about how all this turned out,
given how much goodwill we put into getting it contributed upstream,
but if that's the fate of it…

Didier

[1] http://lists.freedesktop.org/archives/systemd-devel/2015-March/029186.html
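The launching side of the proposal quoted above (connect after fork and before exec, pass the descriptor via -C, omit -C when nothing listens) could look roughly like the sketch below. It assumes a Linux system; the helper names try_connect, fsck_argv and spawn_fsck are hypothetical, and the -a and -T flags are only an example command line, not taken from the thread.

```python
import os
import socket


def try_connect(path):
    """Return a connected, inheritable socket, or None if nobody listens."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except OSError:
        sock.close()
        return None
    sock.set_inheritable(True)  # keep the descriptor open across exec
    return sock


def fsck_argv(device, progress_fd=None):
    """Build the fsck command line; add -C<fd> only when a socket exists."""
    argv = ["fsck", "-a", "-T"]
    if progress_fd is not None:
        argv.append("-C%d" % progress_fd)
    return argv + [device]


def spawn_fsck(device, path):
    """Fork, connect in the child, then exec fsck. Returns the child PID."""
    pid = os.fork()
    if pid == 0:
        try:
            # Connecting in the child means SO_PEERCRED on the listening
            # side reports this PID, which exec then preserves for fsck.
            sock = try_connect(path)
            fd = sock.fileno() if sock is not None else None
            argv = fsck_argv(device, fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    return pid
```

Connecting in the child rather than the parent matters for the kill-by-PID idea: the peer credentials are recorded at connect time, so a connection made by the parent would identify the launcher instead of fsck.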
[systemd-devel] fsckd needs to go
Heya,

so we discussed the whole fsckd situation a bit more here in Berlin,
and we came to the conclusion that fsckd really should not exist the
way it does in systemd.

To start with, the code is really wrong, it should never have been
merged in its current state, the read/write logic for the sockets is
completely borked (I cannot even boot my own machine reliably with
it!). And to my knowledge there has been no attempt to fix all of
that, even though I asked for it. It also doesn't do at all what I
suggested initially, as the flow of data is now fsck → systemd-fsck →
systemd-fsckd → plymouth, and that's just crazy, that's two steps too
many. systemd is supposed to be a few components playing well
together, but certainly not a baroque network of components where data
is passed through four hoops before it reaches the destination...

Then, there's my general reservation with fsckd at all: file systems
that still require offline fsck are certainly not the future, but we
develop stuff for the future, and the idea to kill an fsck process
while it is running is also very very questionable. There's a reason
why such functionality never existed on Fedora or RHEL: it's risky. I
mean, it's all good allowing people to shoot themselves in the foot,
but there's really *no* point in making that easy and giving it a
fancy UI with support in the graphical boot splash. Shooting yourself
in the foot should be possible, but not *easily*! And certainly not be
allowed without prior authentication like you are doing it right now
with the plymouth support.

Thus, we decided to remove fsckd again entirely from systemd. However,
if Ubuntu really wants to implement this anyway (I strongly believe
that this is an absolute misfeature!), then I'd be willing to add the
following for you:

systemd-fsckd would try to connect to some AF_UNIX/SOCK_STREAM socket
in the fs, after forking and before execing fsck in the child, and
pass the connected socket to fsck via the -C switch. If the socket is
not connectable it would avoid any -C switch. With this simple change
you can make this work for you: simply write a daemon (outside of
systemd) that listens on that socket and reads the progress data from
it. Using SO_PEERCRED you can query which fsck PID this is from and
use it to kill it. You could even add this to ply natively if you
wish, since it's kinda strange to bump this all off another daemon in
the middle, unnecessarily.

Changing this would actually make it very close to my initial
suggestion, except that we would not have the receiving side for the
progress data in systemd, you'd have to maintain that externally (or
in ply).

I hope such a solution is acceptable?

Lennart

-- 
Lennart Poettering, Red Hat
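For the receiving daemon described above, a minimal Linux-only sketch might look as follows. The function names are hypothetical, and the "pass cur max device" line format is an assumption about what fsck writes to the -C descriptor (it matches what e2fsck is commonly described as emitting, but the thread itself does not specify it).

```python
import os
import signal
import socket
import struct


def listen_for_fsck(path):
    """Create the AF_UNIX/SOCK_STREAM socket that fsck launchers connect to."""
    try:
        os.unlink(path)  # remove a stale socket left by a previous run
    except FileNotFoundError:
        pass
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen()
    return srv


def peer_pid(conn):
    """PID of the process that connected, queried with SO_PEERCRED."""
    size = struct.calcsize("3i")  # struct ucred: pid, uid, gid
    creds = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    pid, _uid, _gid = struct.unpack("3i", creds)
    return pid


def parse_progress(line):
    """Parse one progress record, assumed to be 'pass cur max device'."""
    pass_no, cur, total, device = line.strip().split(None, 3)
    return int(pass_no), int(cur), int(total), device


def cancel(conn):
    """Ask the fsck behind this connection to stop (risky, as argued above)."""
    os.kill(peer_pid(conn), signal.SIGTERM)
```

The PID comes from the kernel rather than from anything the client sends, which is why SO_PEERCRED is suggested for identifying which fsck to signal.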