[REGRESSION] 5.8-rc3: seccomp crash with Chromium, QtWebEngine and related browsers: seccomp-bpf failure in syscall 0072
Dear Andy, Kees, Will, dear kernel community,

With 5.8-rc3 there is a seccomp-related crash which prevents Chromium and QtWebEngine from starting:

Bug 208369 - seccomp crash with Chromium, QtWebEngine and related browsers: seccomp-bpf failure in syscall 0072
https://bugzilla.kernel.org/show_bug.cgi?id=208369

Reverting to 5.8-rc2 fixes the issue.

Best,
-- Martin
Re: Linux 5.3-rc8
Dear Lennart,

Lennart Poettering - 18.09.19, 15:53:25 CEST:
> On Mi, 18.09.19 00:10, Martin Steigerwald (mar...@lichtvoll.de) wrote:
> > > getrandom() will never "consume entropy" in a way that will block
> > > any users of getrandom(). If you don't have enough collected
> > > entropy to seed the rng, getrandom() will block. If you do,
> > > getrandom() will generate as many numbers as you ask it to, even
> > > if no more entropy is ever collected by the system. So it doesn't
> > > matter how many clients you have calling getrandom() in the boot
> > > process - either there'll be enough entropy available to satisfy
> > > all of them, or there'll be too little to satisfy any of them.
> >
> > Right, but then Systemd would not use getrandom() for initial
> > hashmap/UUID stuff since it
>
> Actually things are more complex. In systemd there are four classes of
> random values we need:
>
> 1. High "cryptographic" quality. There are very few needs for this in
[…]
> 2. High "non-cryptographic" quality. This is used for example for
[…]
> 3. Medium quality. This is used for seeding hash tables. These may be
[…]
> 4. Crap quality. There are only a few uses of this, where rand_r() is
> OK.
>
> Of these four cases, the first two might block boot. Because the first
> case is not common you won't see blocking that often though for
> them. The second case is very common, but since we use RDRAND you
> won't see it on any recent Intel machines.
>
> Or to say this all differently: the hash table seeding and the uuid
> case are two distinct cases in systemd, and I am sure they should be.

Thank you very much for your summary of the uses of random numbers in Systemd, and also for your other mail stating that "neither RDRAND nor /dev/urandom know a concept of 'depleting entropy'". I thought they would deplete entropy needed for the initial seeding of crng.
Thank you also for taking part in this discussion, even if someone put your mail address on carbon copy without asking.

I do not claim I understand enough of this random number stuff. But I feel it's important that kernel and userspace developers actually talk with each other about a sane approach for it. And I believe that the complexity involved is part of the issue. I feel an API for obtaining random numbers with different quality levels needs to be much, much, much simpler to use *properly*.

I felt a bit overwhelmed by the discussion (and by what else is happening in my life, just having come back from holding a Linux performance workshop in front of about two dozen people), so I intend to step back from it. If one of my mails actually helped to encourage or facilitate kernel space and user space developers talking with each other about a sane approach to random numbers, then I may have used my soft skills in a way that brings some benefit. For the technical aspects there are certainly people taking part in this discussion who are much, much deeper into the intricacies of entropy in Linux and computers in general, so I just hope for a good outcome.

Best,
-- Martin
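[The four quality classes Lennart lists map fairly directly onto the sources a userspace program can pick from. A minimal sketch in Python — the class names and the mapping are my own paraphrase of the mail above, not systemd's actual code:]

```python
import os
import random

def get_random_bytes(quality: str, n: int = 16) -> bytes:
    """Illustrative mapping of the four quality classes to sources.
    A sketch of the idea, not systemd's implementation."""
    if quality == "crypto":
        # 1. High "cryptographic" quality: must come from a seeded
        # CRNG; getrandom(2) with flags=0 may block early at boot.
        return os.getrandom(n)
    if quality == "high":
        # 2. High "non-cryptographic" quality (e.g. hashed into
        # UUIDs): RDRAND-style sources are acceptable here.
        return os.urandom(n)
    if quality == "medium":
        # 3. Medium quality: hash-table seeds; predictability only
        # enables DoS, not key recovery.
        return random.SystemRandom().getrandbits(n * 8).to_bytes(n, "little")
    # 4. "Crap" quality: rand_r()-level, fine for jitter/backoff.
    return bytes(random.randrange(256) for _ in range(n))

for q in ("crypto", "high", "medium", "crap"):
    print(q, len(get_random_bytes(q)))
```

[Only the first class can ever block; the other three are always serviceable, which is why only the first two matter for boot ordering.]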
Re: Linux 5.3-rc8
Matthew Garrett - 17.09.19, 23:52:00 CEST:
> On Tue, Sep 17, 2019 at 11:38:33PM +0200, Martin Steigerwald wrote:
> > My understanding of entropy always has been that only a certain
> > amount of it can be produced in a certain amount of time. If that
> > is wrong… please by all means, please teach me, how it would be.
>
> getrandom() will never "consume entropy" in a way that will block any
> users of getrandom(). If you don't have enough collected entropy to
> seed the rng, getrandom() will block. If you do, getrandom() will
> generate as many numbers as you ask it to, even if no more entropy is
> ever collected by the system. So it doesn't matter how many clients
> you have calling getrandom() in the boot process - either there'll be
> enough entropy available to satisfy all of them, or there'll be too
> little to satisfy any of them.

Right, but then Systemd would not use getrandom() for initial hashmap/UUID stuff, since it 1) would block boot very early then, which is not desirable, and 2) does not need strong random numbers anyway. At least that is how I understood Lennart's comments on the Systemd bug report I referenced.

AFAIK the hashmap/UUID stuff uses *some* entropy *before* crng has been seeded, and all I wondered was whether this use of *some* entropy *before* crng has been seeded – via /dev/urandom initially, but now, as far as I understand, via RDRAND if available – will delay the process of gathering the entropy necessary to seed crng. If that is the case, then anything that uses crng during or soon after boot, like gdm, sddm or OpenSSH's ssh-keygen, will be blocked for a longer time while the initial seeding of crng is being done.

Of course, if the hashmap/UUID stuff does not use any entropy that would be required for the *initial* seeding of crng, then… that would not be the case. But from what I understood, it does.
And yes, for "systemd-random-seed" it is true that it does not drain entropy for getrandom(), because it writes the seed to disk *after* crng has been initialized, i.e. at a time when getrandom() would never block again as long as the system is running.

If I am still completely misunderstanding something there, then it may be better to go to sleep. Which I will do now anyway. Or I may just not be very good at explaining what I mean.

-- Martin
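[Matthew's point – getrandom() blocks only until the pool is seeded once, and never again afterwards – can be checked from userspace with GRND_NONBLOCK, which fails with EAGAIN instead of blocking. A small probe, using Python's os.getrandom wrapper around the Linux syscall:]

```python
import os

def crng_ready() -> bool:
    """Return True once the kernel CRNG has been seeded.
    GRND_NONBLOCK makes getrandom(2) fail with EAGAIN instead of
    blocking, so this probe never hangs the caller."""
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
        return True
    except BlockingIOError:
        return False

if crng_ready():
    # After the one-time seeding, getrandom() returns as many bytes
    # as requested; nothing is "consumed" that would block others.
    print("seeded,", len(os.getrandom(32)), "bytes delivered")
else:
    print("CRNG not seeded yet; getrandom(buf, n, 0) would block")
```

[On any long-running system the first branch is taken; the second can only be seen very early at boot on an entropy-starved machine.]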
Re: Linux 5.3-rc8
Ahmed S. Darwish - 17.09.19, 22:52:34 CEST:
> On Tue, Sep 17, 2019 at 10:28:47PM +0200, Martin Steigerwald wrote:
> [...]
> > I don't have any kernel logs old enough to see whether crng init
> > times have been different with Systemd due to asking for
> > randomness for UUID/hashmaps.
>
> Please stop claiming this. It has been pointed out to you, __multiple
> times__, that this makes no difference. For example:
>
> https://lkml.kernel.org/r/20190916024904.ga22...@mit.edu
>
> No. getrandom(2) uses the new CRNG, which is either initialized,
> or it's not ... So to the extent that systemd has made systems
> boot faster, you could call that systemd's "fault".
>
> You've claimed this like 3 times before in this thread already, and
> multiple people replied with the same response. If you don't get the
> paragraph above, then please don't continue replying further on this
> thread.

First off, the mail you referenced was not an answer to a mail of mine. It does not have my mail address in Cc. So no, it has not been pointed out directly to me in that mail.

Secondly: Pardon me, but I do not see how asking for entropy early at boot time, or not doing so, has *no effect* on the available entropy¹. And I do not see the above mail actually saying this. To my knowledge Sysvinit does not need entropy for itself². The above mail merely talks about the blocking on boot, and about whether systemd-random-seed would drain entropy, not about whether the hashmap/UUID stuff does. It also does not address the effect that asking for entropy early has on the available entropy and on the *initial* initialization time of the new CRNG. However, I did not claim that Systemd would block booting. *Not at all*.

Thirdly: I disagree with the tone you use in your mail. And for that alone I feel it may be better for me to let go of this discussion.

My understanding of entropy always has been that only a certain amount of it can be produced in a certain amount of time.
If that is wrong… then please, by all means, teach me how it actually works. However, I am not even claiming anything. All I wrote above is that I do not have any measurements. But I'd expect that the more entropy is asked for early during boot, the longer the initial initialization of the new CRNG will take. And if something else relies on this initialization, that something else would block for a longer time. I got that the new crng won't block after that anymore.

[1] https://github.com/systemd/systemd/issues/4167 (I know that this was still with /dev/urandom, so if it is using RDRAND now, this may indeed be different – but would it then deplete entropy the CPU has available and that by default is fed into the Linux crng as well (even without trusting it completely)?)

[2] According to https://daniel-lange.com/archives/152-Openssh-taking-minutes-to-become-available,-booting-takes-half-an-hour-...-because-your-server-waits-for-a-few-bytes-of-randomness.html sysvinit does not contain a single line of code about entropy or random numbers. Daniel even updated his blog post with a hint to this discussion.

Thanks,
-- Martin
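[The "available entropy" being argued about here is something one can actually watch: the kernel exports its accounting under /proc/sys/kernel/random. A tiny reader, for anyone who wants their own measurements rather than expectations (on kernels from 5.18 on, the numbers are largely fixed at 256 bits, which itself illustrates that the old "depletion" model is gone):]

```python
def pool_status():
    """Read the kernel's entropy accounting from procfs."""
    base = "/proc/sys/kernel/random"
    with open(f"{base}/entropy_avail") as f:
        avail = int(f.read())
    with open(f"{base}/poolsize") as f:
        size = int(f.read())
    return avail, size

avail, size = pool_status()
print(f"entropy_avail: {avail}/{size} bits")
```

[Sampling this once per second across a boot would give exactly the data missing from this sub-thread.]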
Re: Linux 5.3-rc8
Willy Tarreau - 17.09.19, 19:29:29 CEST:
> On Tue, Sep 17, 2019 at 07:13:28PM +0200, Lennart Poettering wrote:
> > On Di, 17.09.19 18:21, Willy Tarreau (w...@1wt.eu) wrote:
> > > On Tue, Sep 17, 2019 at 05:57:43PM +0200, Lennart Poettering wrote:
> > > > Note that calling getrandom(0) "too early" is not something
> > > > people do on purpose. It happens by accident, i.e. because we
> > > > live in a world where SSH or HTTPS or so is run in the initrd
> > > > already, and in a world where booting sometimes can be very
> > > > very fast.
> > >
> > > It's not an accident, it's a lack of understanding of the impacts
> > > from the people who package the systems. Generating an SSH key
> > > from an initramfs without thinking where the randomness used for
> > > this could come from is not accidental, it's a lack of experience
> > > that will be fixed once they start to collect such reports. And
> > > those who absolutely need their SSH daemon or HTTPS server for a
> > > recovery image in initramfs can very well feed fake entropy by
> > > dumping whatever they want into /dev/random to make it possible
> > > to build temporary keys for use within this single session. At
> > > least all supposedly incorrect use will be made *on purpose* and
> > > will still be possible to match what users need.
> >
> > What do you expect these systems to do though?
> >
> > I mean, think about general purpose distros: they put together live
> > images that are supposed to work on a myriad of similar (as in: same
> > arch) but otherwise very different systems (i.e. VMs that might lack
> > any form of RNG source the same as beefy servers with multiple
> > sources the same as older netbooks with few and crappy sources,
> > ...). They can't know what the specific hw will provide or won't.
> > It's not their incompetence that they build the image like that.
> > It's a common, very common usecase to install a system via SSH, and
> > it's also very common to have very generic images for a large
> > number of varied systems to run on.
>
> I'm totally fine with installing the system via SSH, using a temporary
> SSH key. I do make a strong distinction between the installation
> phase and the final deployment. The SSH key used *for installation*
> doesn't need to be the same as the final one. And very often at the
> end of the installation we'll have produced enough entropy to produce
> a correct key.

Well… the systems that cloud-init adapts may come from the same template. Cloud-init thus replaces the key that has been there before on their first boot. There is no "installation". Cloud-init could replace the key in the background… and restart SSH then… but that would give those big fat man-in-the-middle warnings, and all systems would use the same SSH host key initially. I just don't see a good way at the moment how to handle this. Introducing an SSH mode for this – still a temporary, not-so-random key, but with proper warnings – might be challenging to get right from both a security and a usability point of view. And it would add complexity.

That said, with Proxmox VE on Fujitsu S8 or Intel NUCs I have never seen this issue even when starting 50 VMs in a row. However, for large cloud providers, starting 50 VMs in a row does not sound like all that much. And I bet with Proxmox VE virtio-rng is easily available, because it uses KVM.

-- Martin
Re: Linux 5.3-rc8
Willy Tarreau - 17.09.19, 18:21:37 CEST:
> On Tue, Sep 17, 2019 at 05:57:43PM +0200, Lennart Poettering wrote:
> > Note that calling getrandom(0) "too early" is not something people
> > do on purpose. It happens by accident, i.e. because we live in a
> > world where SSH or HTTPS or so is run in the initrd already, and in
> > a world where booting sometimes can be very very fast.
>
> It's not an accident, it's a lack of understanding of the impacts
> from the people who package the systems. Generating an SSH key from
> an initramfs without thinking where the randomness used for this
> could come from is not accidental, it's a lack of experience that
> will be fixed once they start to collect such reports. And those who
> absolutely need their SSH daemon or HTTPS server for a recovery image
> in initramfs can very well feed fake entropy by dumping whatever they
> want into /dev/random to make it possible to build temporary keys for
> use within this single session. At least all supposedly incorrect use
> will be made *on purpose* and will still be possible to match what
> users need.

Well, I wondered before whether SSH key generation for cloud-init or other automatically individualized systems could happen in the background, replacing a key that was there before, so that SSH would be available *before* the key is regenerated. But then there are those big fat man-in-the-middle warnings… and I have no clear idea how to handle this in a way that would both be secure and not scare users off too much. Well, probably systems at some point better have good entropy very quickly… and that is it. (And then quantum computers may crack those good keys anyway in the future.)

-- Martin
Re: Linux 5.3-rc8
Linus Torvalds - 17.09.19, 20:01:23 CEST:
> > We can make boot hang in "sane", discoverable way.
>
> That is certainly a huge advantage, yes. Right now I suspect that what
> has happened is that this has probably been going on as some
> low-level background noise for a while, and people either figured it
> out and switched away from gdm (example: Christoph), or more likely
> some unexplained boot problems that people just didn't chase down. So
> it took basically a random happenstance to make this a kernel issue.
>
> But "easily discoverable" would be good.

Well, I meanwhile remembered how it was with sddm: Without CPU assistance (RDRAND) or haveged or any other source of entropy, sddm would simply not appear and I'd see the tty1 login. Then I'd start to type something, and after a while sddm popped up. If I did not type anything, it easily took at least half a minute until it appeared. Actually I used my system like this for quite a while, because I did not feel comfortable with haveged and RDRAND. AFAIR this was while this Debian system still ran with Systemd.

What the Debian maintainer for sddm did was this:

sddm (0.18.0-1) unstable; urgency=medium
[…]
  [ Maximiliano Curia ]
  * Workaround entropy starvation by recommending haveged
  * Release to unstable

 -- Maximiliano Curia […]  Sun, 22 Jul 2018 13:26:44 +0200

With Sysvinit I still have neither haveged nor RDRAND enabled, but behavior changed a bit.
crng init still takes a while:

% zgrep -h "crng init" /var/log/kern.log*
Sep 16 09:06:23 merkaba kernel: [   16.910096][   C3] random: crng init done
Sep  8 14:08:39 merkaba kernel: [   16.682014][   C2] random: crng init done
Sep  9 09:16:43 merkaba kernel: [   46.084188][   C2] random: crng init done
Sep 11 10:52:37 merkaba kernel: [   47.209825][   C3] random: crng init done
Sep 12 08:32:08 merkaba kernel: [   76.624375][   C3] random: crng init done
Sep 12 20:07:29 merkaba kernel: [   10.726349][   C2] random: crng init done
Sep  8 10:02:42 merkaba kernel: [   37.391577][   C2] random: crng init done
Aug 26 09:23:51 merkaba kernel: [   40.555337][   C3] random: crng init done
Aug 28 09:45:28 merkaba kernel: [   39.446847][   C1] random: crng init done
Aug 20 10:14:59 merkaba kernel: [   12.242467][   C1] random: crng init done

and there might be a slight delay before sddm appears, after tty1 has been initialized. I am not completely sure whether it is related to sddm or something else. But AFAIR the delays have been in the range of a maximum of 5-10 seconds, so I did not bother to check more closely. Note this is on a ThinkPad T520, which is a PC.

And if I read the above kernel log excerpts right, it can still take up to 76 seconds for crng to be initialized with entropy. Would be interesting to see other people's numbers there.

There might be a different ordering with Sysvinit and it may still be sddm. But I never have seen a delay of 76 seconds AFAIR… so something else might be different, or I just did not notice the delay. Sometimes I switch on the laptop and do something else, coming back a minute or so later.

I don't have any kernel logs old enough to see whether crng init times have been different with Systemd due to asking for randomness for UUID/hashmaps.

Thanks,
-- Martin
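[The spread in those timings is easy to pull out of the logs programmatically. A small sketch — the regex assumes the bracketed-seconds-since-boot format of the excerpt above, shown here on three of the lines:]

```python
import re

# Sample of the "crng init done" lines from the kernel log excerpt.
LOG = """\
Sep 16 09:06:23 merkaba kernel: [   16.910096][   C3] random: crng init done
Sep 12 08:32:08 merkaba kernel: [   76.624375][   C3] random: crng init done
Sep 12 20:07:29 merkaba kernel: [   10.726349][   C2] random: crng init done
"""

# The first bracketed field is seconds since boot.
times = [float(re.search(r"\[\s*(\d+\.\d+)\]", line).group(1))
         for line in LOG.splitlines() if "crng init done" in line]

print(f"fastest: {min(times):.1f}s, slowest: {max(times):.1f}s")
```

[Run over the full set of logs (e.g. the zgrep output piped in), this would give the per-boot crng init times being asked about.]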
Re: Linux 5.3-rc8
Willy Tarreau - 17.09.19, 10:35:16 CEST:
> On Tue, Sep 17, 2019 at 09:33:40AM +0200, Martin Steigerwald wrote:
> > However this again would be burdening users with an issue they
> > should not have to care about. Unless userspace developers care
> > enough and manage to take time to fix the issue before updated
> > kernels come to their systems. Cause again it would be users'
> > systems that would not be working. Just cause kernel and userspace
> > developers did not agree and chose to fight with each other
> > instead of talking *with* each other.
>
> It has nothing to do with fighting at all, it has to do with offering
> what applications *need* without breaking existing assumptions that
> make most applications work. And more importantly it involves not
[…]

Well, I got the impression or interpretation that it would be about fighting… if it is not, all the better!

> > At least with killing gdm Systemd may restart it if configured to
> > do so. But if it doesn't, the user is again stuck with a
> > non-working system until restarting gdm themselves.
> >
> > It may still make sense to make the API harder to use,
>
> No. What is hard to use is often misused. It must be harder to misuse
> it, which means it should be easier to correctly use it. The choice
> of flag names and the emission of warnings definitely helps during
> the development stage.

Sorry, this was a typo of mine. I actually meant harder to *abuse*. Anything else would not make sense in the context of what I have written. Make it easier to use properly and harder to abuse.

> > but it does not
> > replace talking with userspace developers and it would need some
> > time to allow for adapting userspace applications and services.
>
> Which is how adding new flags can definitely help even if adoption
> takes time. By the way, in this discussion I am a userspace developer
> and have been hit several times by libraries switching to getrandom()
> that silently failed to respond in the field. As a userspace
> developer, I really want to see a solution to this problem. And I'm
> fine if the kernel decides to kill haproxy for using getrandom() with
> the old settings, at least users will notice, will complain to me and
> will update.

Good to see that you are also engaging as a userspace developer in the discussion.

Thanks,
-- Martin
Re: Linux 5.3-rc8
Willy Tarreau - 17.09.19, 07:24:38 CEST:
> On Mon, Sep 16, 2019 at 06:46:07PM -0700, Matthew Garrett wrote:
> > > Well, the patch actually made getrandom() return an error too,
> > > but you seem more interested in the hypotheticals than in arguing
> > > actualities.
> >
> > If you want to be safe, terminate the process.
>
> This is an interesting approach. At least it will cause bug reports
> in applications using getrandom() in an unreliable way and they will
> check for other options. Because one of the issues with systems that
> do not finish to boot is that usually the user doesn't know what
> process is hanging.

A userspace process could still poll the kernel by forking a child that uses getrandom() and waiting until that child no longer gets terminated. And then it would still hang. So yes, that would make it harder to abuse the API, but not impossible. Which may still be good, I don't know.

Either the kernel does not reveal at all whether it has seeded the CRNG, leaving GnuPG, OpenSSH and others in the dark, or it does, and risks that userspace does stupid things with that information, whether it prints a big fat warning or not. Of course the warning could be worded like:

  process blocking on entropy too early on boot without giving the
  kernel much chance to gather entropy. this is not a kernel issue,
  report to userspace developers

And probably then kill the process, so at least users will know.

However, this again would be burdening users with an issue they should not have to care about. Unless userspace developers care enough and manage to take time to fix the issue before updated kernels come to their systems. Cause again it would be users' systems that would not be working. Just cause kernel and userspace developers did not agree and chose to fight with each other instead of talking *with* each other.

At least with killing gdm, Systemd may restart it if configured to do so. But if it doesn't, the user is again stuck with a non-working system until restarting gdm themselves.

It may still make sense to make the API harder to use, but it does not replace talking with userspace developers, and it would need some time to allow for adapting userspace applications and services.

-- Martin
a sane approach to random numbers (was: Re: Linux 5.3-rc8)
As this is not about Linux 5.3-rc8 anymore, I took the liberty to change the subject.

Linus Torvalds - 17.09.19, 01:05:47 CEST:
> On Mon, Sep 16, 2019 at 4:02 PM Matthew Garrett wrote:
> > The semantics many people want for secure key generation is
> > urandom, but with a guarantee that it's seeded.
>
> And that is exactly what I'd suggest GRND_SECURE should do.
>
> The problem with:
> > getrandom()'s default behaviour at present provides that
>
> is that exactly because it's the "default" (ie when you don't pass
> any flags at all), that behavior is what all the random people get
> who do *not* really intentionally want it, they just don't think
> about it.
>
> > Changing the default (even with kernel warnings) seems like
> > it risks people generating keys from an unseeded prng, and that
> > seems like a bad thing?
>
> I agree that it's a horrible thing, but the fact that the default 0
> behavior had that "wait for entropy" is what now causes boot problems
> for people.

Seeing all the discussion, I just got the impression that it may be best to start from scratch. To stop trying to fix something that was broken to begin with – at least that is what I got from the discussion here. Do a sane API with new function names and new flag names, and over time deprecate the old one completely, so that one day it hopefully could be gradually disabled until it can be removed. Similar to how statx() is hopefully replacing stat() someday. And write some documentation about how it is to be used by userspace developers, i.e. like: if the kernel says it is not random, do not block and poll on it, but do something to generate entropy. But maybe that is naive, too.

However, in the end, whatever you kernel developers come up with, I bet there will be no way to make the kernel control userspace developers. However, I have the impression that that is what you attempt to do here.
As long as you have an API to obtain guaranteed random numbers – or at least somewhat guaranteed random numbers – that is not directly available at boot time, userspace could poll on its availability. At least as long as the kernel is honest about its unavailability and tells about it. And if it doesn't, applications that *require* random numbers can never know whether they got some from the kernel. Maybe you can make an API that is hard to abuse, yes. And that is good. But impossible?

I wonder: How could the Linux experience look if kernel developers and userspace developers actually worked together instead of finding ways to fight each other? I mean, for the most common userspace applications in the free software continuum, there would not be all that many people to talk with, or would there? It is basically gdm, sddm, some other display managers probably, SSH, GnuPG and probably a few more. For example, for gdm someone could open a bug report about its use of the current API and ask it to use something that is non-blocking. And does Systemd really need to deplete the random pool early at boot in order to generate UUIDs? Even though I do not use GNOME, I'd be willing to help with filing a few bug reports here and there. AFAIR there has been something similar with sddm, which I used, but I believe it has already been fixed there.

Sometimes I wonder what would happen if kernel and userspace developers actually *talked* to each other, or better, *with* each other. But instead, for example, Lennart appears to be afraid to interact with the kernel community, and some kernel developers just talked about personalities that they find difficult to interact with, judging them to be like this and like that. There is a social, soft-skill issue here that no amount of technical excellence will resolve. That is at least how I observe this. Does it make it easier? Probably not.
I fully appreciate that some people may have a difficult time talking with each other; I have experienced this myself often enough. I did not file a bug report with Systemd for something I found recently, just because I do not like to repeat the experience I had when I reported bugs about it before, and I do not use it anymore personally anyway. So I totally get that.

However… not talking with each other is not going to resolve cases where userspace uses a kernel API in a way kernel developers do not agree with, causing issues like stalled boots. Basically, userspace can abuse any kernel API, and in the end the kernel can do nothing about it.

Of course, feel free to ignore this if you think it is not useful.

Thanks,
-- Martin
Re: Linux 5.3-rc8
Ahmed S. Darwish - 14.09.19, 23:11:26 CEST:
> > Yeah, the above is yet another example of completely broken
> > garbage.
> >
> > You can't just wait and block at boot. That is simply 100%
> > unacceptable, and always has been, exactly because that may
> > potentially mean waiting forever since you didn't do anything that
> > actually is likely to add any entropy.
>
> ACK, the systemd commit which introduced that code also does:
>
> => 26ded5570994 (random-seed: rework systemd-random-seed.service..)
> [...]
> --- a/units/systemd-random-seed.service.in
> +++ b/units/systemd-random-seed.service.in
> @@ -22,4 +22,9 @@ Type=oneshot
>  RemainAfterExit=yes
>  ExecStart=@rootlibexecdir@/systemd-random-seed load
>  ExecStop=@rootlibexecdir@/systemd-random-seed save
> -TimeoutSec=30s
> +
> +# This service waits until the kernel's entropy pool is
> +# initialized, and may be used as ordering barrier for service
> +# that require an initialized entropy pool. Since initialization
> +# can take a while on entropy-starved systems, let's increase the
> +# time-out substantially here.
> +TimeoutSec=10min
>
> This 10min wait thing is really broken... it's basically "forever".

I am so happy to use Sysvinit on my systems again. Depending on entropy just for booting a machine is broken¹.

Of course, regenerating SSH keys on boot, probably due to cloud-init replacing the old key after a VM has been cloned from a template, may still be a challenge to handle well². I'd probably replace the SSH keys in the background and restart the service then, but this may lead to spurious man-in-the-middle warnings.

[1] Debian Buster release notes: 5.1.4. Daemons fail to start or system appears to hang during boot
https://www.debian.org/releases/stable/amd64/release-notes/ch-information.en.html#entropy-starvation

[2] Openssh taking minutes to become available, booting takes half an hour ... because your server waits for a few bytes of randomness
https://daniel-lange.com/archives/152-hello-buster.html

Thanks,
-- Martin
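[For context on what the service in that diff actually does: at boot it mixes last boot's saved seed into the pool, and at shutdown (ExecStop) it saves fresh bytes for next boot. A rough sketch of those two halves — the seed path here is hypothetical, and this is an illustration of the mechanism, not systemd's actual code. Note that plain writes to /dev/urandom stir data in but credit no entropy (crediting needs the RNDADDENTROPY ioctl and privileges), which is why loading a seed alone cannot unblock getrandom():]

```python
import os

SEED_FILE = "/tmp/random-seed-demo"   # hypothetical path, not systemd's

def load_seed():
    """Mix a previously saved seed back into the pool. A plain write
    to /dev/urandom stirs the bytes in but credits no entropy."""
    if os.path.exists(SEED_FILE):
        with open(SEED_FILE, "rb") as src, open("/dev/urandom", "wb") as dst:
            dst.write(src.read())

def save_seed(n: int = 256):
    """Save fresh bytes for the next boot. Run after the CRNG is
    initialized, getrandom() can never block again, so saving the
    seed cannot stall shutdown."""
    with open(SEED_FILE, "wb") as f:
        f.write(os.getrandom(n))

save_seed()
load_seed()
```

[The 10-minute TimeoutSec in the diff exists only because the *save* side deliberately waits for the pool to be initialized first, and on entropy-starved systems that one-time wait can be long.]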
Re: DRM-based Oops viewer
Hello Ahmed,

Ahmed S. Darwish - 10.03.19, 02:31:
> Hello DRM/UEFI maintainers,
>
> Several years ago, I wrote a set of patches to dump the kernel
> log to disk upon panic -- through BIOS INT 0x13 services. [1]
>
> The overwhelming response was that it's unsafe to do this in a
> generic manner. Linus proposed a video-based viewer instead: [2]
[…]
> Of course it's 2019 now though, and it's quite known that
> Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
[…]
> The maximum possible that UEFI can provide is a GOP-provided
> framebuffer that's ready to use by the OS -- even after the UEFI
> boot phase is marked as done through ExitBootServices(). [5]
>
> Of course, once native drivers like i915 or radeon take over,
> such a framebuffer is toast... [6]
>
> Thus a possible remaining option, is to display the oops through
> "minimal" DRM drivers provided for each HW variant... Since
> these special drivers will run only and fully under a panic()
> context though, several constraints exist:

Thank you for your idea and willingness to work on something like this.

As a user I'd very much favor a solution that would work not only with UEFI but with other firmware as well. I still dream of being able to buy a laptop with up-to-date hardware and with Coreboot/Libreboot at some time.

While this would not solve all "I just freeze" kinds of crashes, it may at least give some information about some of them. When testing rc kernels I often enough faced "I just freeze" crashes that happened only *sometimes*. On a machine that I also use for production work I find it infeasible to debug this, as bisecting could take a long, long time. And the machine could just crash at any moment… even while doing important work with it.

In my ideal world an operating system would never ever crash or hang without telling why. Well, it would not crash or hang at all… but there you go.
Maybe some day, with a widely usable microkernel-based OS that can restart device drivers that are in a broken state – or at least almost never crash. No discussion of that microkernel topic required here. :)

Thanks,
-- Martin
Re: Bug#919356: Licensing of include/linux/hash.h
On 2/11/19 11:27 PM, Ben Finney wrote:
> Martin Steigerwald writes:
>
>> Well the file has in its header:
>>
>> /* Fast hashing routine for a long.
>>    (C) 2002 William Lee Irwin III, IBM */
>>
>> /*
>>  * Knuth recommends primes in approximately golden ratio to the maximum
>>  * integer representable by a machine word for multiplicative hashing.
>>  * Chuck Lever verified the effectiveness of this technique:
>>  * http://www.citi.umich.edu/techreports/reports/citi-tr-00-1.pdf
>>  *
>>  * These primes are chosen to be bit-sparse, that is operations on
>>  * them can use shifts and additions instead of multiplications for
>>  * machines where multiplications are slow.
>>  */
>>
>> It has been quite a while ago. I bet back then I did not regard this
>> as license information since it does not specify a license. Thus I
>> assumed it to be GPL-2 as the other files which have no license boiler
>> plate. I.e.: check if the file has a different license; if not, then
>> assume it has the license as specified in COPYING.
>>
>> Not specifying a license can however also mean in this context that it
>> has no license, as the file contains copyright information from another
>> author.
>
> If a work (even one file) “has no license”, that means no special
> permissions are granted and normal copyright applies: All rights
> reserved, i.e. not redistributable. So, no license is grounds to
> consider a work non-free and non-redistributable.
>
> If, on the other hand, the file is to be free software, there would need
> to be a clear grant of some free software license to that work.
>
> Given the confusion over this file, I would consider it a significant
> risk to just assume we have GPLv2 permissions without being told that
> explicitly by the copyright holder. Rather, the reason we are seeking a
> clearly-granted free license for this one file is because we are trying
> to replace a probably non-free file with the same code in it.
> > It seems we need to keep looking, and in the meantime assume we have no > free license in this file. FWIW, fio.c includes the following mention: * The license below covers all files distributed with fio unless otherwise * noted in the file itself. followed by the GPL v2 license. I'll go through and add SPDX headers to everything to avoid wasting any more time on this nonsense. -- Jens Axboe
Re: Bug#919356: Licensing of include/linux/hash.h
Jens Axboe - 12.02.19, 17:16: > On 2/11/19 11:27 PM, Ben Finney wrote: > > Martin Steigerwald writes: > >> Well the file has in its header: > >> > >> /* Fast hashing routine for a long. > >> > >>(C) 2002 William Lee Irwin III, IBM */ > >> > >> /* > >> > >> * Knuth recommends primes in approximately golden ratio to the > >> maximum * integer representable by a machine word for > >> multiplicative hashing. * Chuck Lever verified the effectiveness > >> of this technique: > >> * http://www.citi.umich.edu/techreports/reports/citi-tr-00-1.pdf > >> * > >> * These primes are chosen to be bit-sparse, that is operations on > >> * them can use shifts and additions instead of multiplications for > >> * machines where multiplications are slow. > >> */ > >> > >> It has been quite a while ago. I bet back then I did not regard > >> this > >> as license information since it does not specify a license. Thus I > >> assumed it to be GPL-2 as the other files which have no license > >> boiler plate. I.e.: Check file is it has different license, if > >> not, then assume it has license as specified in COPYING. > >> > >> Not specifying a license can however also mean in this context that > >> it has no license as the file contains copyright information from > >> another author. > > > > If a work (even one file) “has no license”, that means no special > > permissions are granted and normal copyright applies: All rights > > reserved, i.e. not redistributable. So, no license is grounds to > > consider a work non-free and non-redistributable. > > > > If, on the other hand, the file is to be free software, there would > > need to be a clear grant of some free software license to that > > work. > > > > Given the confusion over this file, I would consider it a > > significant > > risk to just assume we have GPLv2 permissions without being told > > that > > explicitly by the copyright holder. 
Rather, the reason we are > > seeking a clearly-granted free license for this one file, is > > because we are trying to replace a probably non-free file with the > > same code in it. > > > > It seems we need to keep looking, and in the meantime assume we have > > no free license in this file. > > FWIW, fio.c includes the following mention: > > * The license below covers all files distributed with fio unless > otherwise * noted in the file itself. > > followed by the GPL v2 license. I'll go through and add SPDX headers > to everything to avoid wasting any more time on this nonsense. Thank you, Jens, for settling this. I did not remember that one. It may very well be that I saw this note when I initially packaged fio as my first package for Debian about 10 years ago. I forwarded your mail and the one from Domenico with the SPDX patch to Debian bug #922112 fio: hash.h is not DFSG compliant https://bugs.debian.org/922112 which I closed earlier, as you had already said that hash.c is GPL-2. Thanks, -- Martin
Re: [REGRESSION] 5.0-rc2: iptables -nvL consumes 100% of CPU and hogs memory with kernel 5.0-rc2
Florian Westphal - 15.01.19, 11:15: > Michal Kubecek wrote: > > > I upgraded to self-compiled 5.0-rc2 today and found the machine to > > > be slow after startup. I saw iptables consuming 100% CPU, it only > > > responded to SIGKILL. It got restarted several times, probably by > > > some systemd service. > > > > > > Then I started 'iptables -nvL' manually. And I got this: > > > > > > % strace -p 5748 > > > [… tons more, in what appeared an endless loop …] > > This is fixed by: > > http://patchwork.ozlabs.org/patch/1024772/ > ("netfilter: nf_tables: Fix for endless loop when dumping ruleset"). Thanks, Florian. I will wait for the first 5.0-rcX with X > 2 that contains the fix. The bug is already closed on the Debian side; it was premature to report it there. Ciao, -- Martin
[REGRESSION] 5.0-rc2: iptables -nvL consumes 100% of CPU and hogs memory with kernel 5.0-rc2
Hi! Does that ring a bell with someone? For now I just downgraded, no time for detailed analysis. Debian bug report at:

iptables -nvL consumes 100% of CPU and hogs memory with kernel 5.0-rc2
https://bugs.debian.org/919325

4.20 works, 5.0-rc2 showed this issue with iptables. Configurations attached. Excerpt from the Debian bug report follows:

I upgraded to self-compiled 5.0-rc2 today and found the machine to be slow after startup. I saw iptables consuming 100% CPU, it only responded to SIGKILL. It got restarted several times, probably by some systemd service.

Then I started 'iptables -nvL' manually. And I got this:

% strace -p 5748
[… tons more, in what appeared an endless loop …]
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=}, msg_namelen=12, msg_iov=[{iov_base={{len=372, type=0xa06 /* NLMSG_??? */, flags=NLM_F_MULTI|0x800, seq=0, pid=5748}, "\x02\x00\x00\x11\x0b\x00\x01\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x0b\x00\x02\x00\x4f\x55\x54\x50\x55\x54\x00\x00\x0c\x00\x03\x00"...}, iov_len=16536}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 372
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=}, msg_namelen=12, msg_iov=[{iov_base={{len=372, type=0xa06 /* NLMSG_??? */, flags=NLM_F_MULTI|0x800, seq=0, pid=5748}, "\x02\x00\x00\x11\x0b\x00\x01\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x0b\x00\x02\x00\x4f\x55\x54\x50\x55\x54\x00\x00\x0c\x00\x03\x00"...}, iov_len=16536}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 372
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=}, msg_namelen=12, msg_iov=[{iov_base={{len=372, type=0xa06 /* NLMSG_??? */, flags=NLM_F_MULTI|0x800, seq=0, pid=5748}, "\x02\x00\x00\x11\x0b\x00\x01\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x0b\x00\x02\x00\x4f\x55\x54\x50\x55\x54\x00\x00\x0c\x00\x03\x00"...}, iov_len=16536}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 372
recvmsg(3, ^C{msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=}, msg_namelen=12, msg_iov=[{iov_base={{len=372, type=0xa06 /* NLMSG_??? */, flags=NLM_F_MULTI|0x800, seq=0, pid=5748}, "\x02\x00\x00\x11\x0b\x00\x01\x00\x66\x69\x6c\x74\x65\x72\x00\x00\x0b\x00\x02\x00\x4f\x55\x54\x50\x55\x54\x00\x00\x0c\x00\x03\x00"...}, iov_len=16536}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 372
strace: Process 5748 detached

and this (output from atop):

PID TID MINFLT MAJFLT VSTEXT VSLIB VDATA VSTACK VSIZE RSIZE PSIZE VGROW RGROW SWAPSZ RUID EUID MEM CMD 1/16
11575 - 615520 152K 2324K 5.0G 132K 5.1G 5.1G 0K 240.4M 240.5M 0K root root 33% iptables

I had it growing till 10 GiB before I stopped it by SIGKILL to prevent excessive swapping. I will attach the kernel configuration. That is all I am willing to spend time on for now before going to sleep. I will however reboot with the older 4.20 kernel to see whether it is kernel related. […]

-- System Information: Debian Release: buster/sid […] Kernel: Linux 5.0.0-rc2-tp520 (SMP w/4 CPU cores; PREEMPT)

Thanks, -- Martin

config-4.20.0-tp520.xz Description: application/xz
config-5.0.0-rc2-tp520.xz Description: application/xz
Re: [PATCH v2] Document /proc/pid PID reuse behavior
Michal Hocko - 07.11.18, 17:00: > > > otherwise anybody could simply DoS the system > > > by consuming all available pids. > > > > People can do that today using the instrument of terror widely known > > as fork(2). The only thing standing between fork(2) and a full > > process table is RLIMIT_NPROC. > > not really. If you really do care about pid space depletion then you > should use pid cgroup controller. It's not quite on-topic, but I am curious now: AFAIK the PID limit is 16 bits. Right? Could it be raised to 32 bits? I bet it would be a major change throughout different parts of the kernel. 16 bits sounds a bit low these days, not only for PIDs, but also for connections / ports. -- Martin
Re: [REGRESSION 4.19-rc2] sometimes hangs with black screen when resuming from suspend or hibernation (was: Re: Linux 4.19-rc2)
This regression is gone with 4.19-rc8. Thanks, Martin Martin Steigerwald - 11.09.18, 09:53: […] > Linus Torvalds - 02.09.18, 23:45: > > As usual, the rc2 release is pretty small. People are taking a > > With 4.19-rc2 this ThinkPad T520 with i5 Sandybridge sometimes hangs > with black screen when resuming from suspend or hibernation. With > 4.18.1 it did not. Of course there have been userspace related updates > that could be related. > > I currently have no time to dig into this and on this production > laptop I generally do not do bisects between major kernel releases. > So currently I only answer questions that do not require much time to > answer. > > For now I switched back to 4.18. If that is stable – and thus likely > no userspace component is related –, I go with 4.19-rc3 or whatever > is most recent version to see if the issue has been fixed already. > > % inxi -z -b -G > System:Host: […] Kernel: 4.18.1-tp520-btrfstrim x86_64 bits: 64 > Desktop: KDE Plasma 5.13.5 >Distro: Debian GNU/Linux buster/sid > Machine: Type: Laptop System: LENOVO product: 42433WG v: ThinkPad > T520 serial: >Mobo: LENOVO model: 42433WG serial: UEFI [Legacy]: > LENOVO v: 8AET69WW (1.49 ) >date: 06/14/2018 > […] > CPU: Dual Core: Intel Core i5-2520M type: MT MCP speed: 2990 MHz > min/max: 800/3200 MHz > Graphics: Device-1: Intel 2nd Generation Core Processor Family > Integrated Graphics driver: i915 v: kernel >Display: x11 server: X.Org 1.20.1 driver: modesetting > resolution: 1920x1080~60Hz >OpenGL: renderer: Mesa DRI Intel Sandybridge Mobile v: 3.3 > Mesa 18.1.7 > […] > Info: Processes: 322 Uptime: 16m Memory: 15.45 GiB used: 3.12 GiB > (20.2%) Shell: zsh inxi: 3.0.22 > > Thanks, > Martin > > > breather after the merge window, and it takes a bit of time for bug > > reports to start coming in and get identified. Plus people were > > probably still on vacation (particularly Europe), and some people > > were at Open Source Summit NA last week too. Having a calm week was > > good. 
> > > > Regardless of the reason, it's pretty quiet. The bulk of it is > > drivers (network and gpu stand out), with the rest being a random > > collection all over (arch/x86 and generic networking stands out, > > but there's misc stuff all over). > > > > Go out and test. > > > > Linus > > > > --- […] -- Martin
Re: Linux 4.19-rc4 released, an apology, and a maintainership note
l...@lkcl.net - 30.09.18, 14:09: > > That written: Quite some of the rude mails that contained swearwords > > I read from you have been about code, not persons. I think this is > > an important distinction. I do not have much of an issue with > > swearing at code :), especially when it is in some humorous way. > > absolutely, and this is one thing that a lot of people are, sadly, > trained pretty much from birth to be incapable of understanding: > namely the difference between criticism of the PERSON and criticism > of the ACTION. > > (1) "YOU are bad! GO STAND IN THE NAUGHTY CORNER!" > (2) "That was a BAD thing to do!" > (3) "That hurt my feelings that you did that" > > the first is the way that poorly-trained parents and kindergarten > teachers talk to children. > > the second is... only marginally better, but it's a start > > the third is how UNICEF trains teachers to treat children as human > beings. During releasing a lot of limiting "stuff" I found that probably nothing written or said can hurt my feelings unless I let it do so or even… unless I choose (!) to feel hurt about it. So at times when I am clear about this, I'd say: "I have chosen to feel hurt about what you did." However in this human experience a lot of people, including myself, still hold on to a lot of limiting "stuff" which invites feeling hurt. We, as humankind, have a history of hurting each other. During this releasing work I also learned about two key ingredients of successful relationships: harmlessness and mutuality. I opted out of the hurting cycle as best I can. And so I choose to write in a way that moves around what, from my own experience of feeling hurt, I know could hurt others. I choose to write in a harmless way, so to say. While still aiming to bring my point across. A very important ingredient for this is to write from my own experience. Of course others can feel hurt about something I would not feel hurt about, and I may not be aware that the other might feel hurt about it. 
That is why in such a case it is important to give and receive feedback. Still when writing from my own experience without saying that anything is wrong with the other, it appears to be unlikely to trigger hurt. That is at least my experience so far. Thanks, -- Martin
Re: Code of Conduct: Let's revamp it.
Pavel Machek - 25.09.18, 15:28: > > > > > Your above argument that the Code of Conduct is problematic > > > > > because of who wrote it seems to contradict your statement > > > > > that we shall judge by code (or text) alone. > > > > I think there are important differences between code to be run > > > > by CPUs and a Code to be run by humans. And when the author > > > > goes on a victory lap on Twitter and declares the Code to be "a > > > > political document", is it any surprise I'm worried? > > > Would you have a link on that? > > The CoC is a political document: > > https://web.archive.org/web/20180924234027/https://twitter.com/coralineada/status/1041465346656530432 > > Possible victory lap 1: > > https://web.archive.org/web/20180921104730/https://twitter.com/coralineada/status/1041441155874009093 > > Possible victory lap 2: > > https://web.archive.org/web/20180920211406/https://twitter.com/coralineada/status/1042249983590838272 > Thanks! > > I thought you were referring to this... http://archive.is/6nhps > ... which is somehow even more disturbing to me. That would be one of the main issues I see with that change: it did not go through the usual review process. I did not know the Contributor Covenant was driven by people with such a strong agenda. I still think that this newly adopted code of conduct document won't kill Linux, as I have strong trust that the community would redact or change that document if need be. I did not agree with the urgency behind the initial discussion, especially as it was mostly initiated by people I'd consider bystanders, but I see benefit in carefully reviewing a code of conduct and I see that the hastily adopted Contributor Covenant may not be a good or the best choice. I still adhere to "take the teaching, not the teacher". I do not care what kind of person the author of the CoC is. So I'd review whether the actual document contents are appropriate for the kernel community. 
I suggest reviewing the Codes of Conduct of KDE¹ and Debian². Both projects seem to run pretty well with a Code of Conduct in place. While what happens regarding a document is always the choice of people, I think one of the most important aspects would be to make sure that the means of enforcement the code of conduct provides align with the highest good of the kernel community. Worded too strongly, it opens up opportunities to abuse the code of conduct. Worded too weakly, it can render the code of conduct ineffective. I think some of the enforcement wording in the Contributor Covenant is not helpful. I don't think that "Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project’s leadership." adds anything useful to the code of conduct. One major question for me is: Is the code of conduct based on fear of being hurt or harassed, or does it aim at a friendly and supportive community? I do not think that a fear-based code of conduct is useful. There is already quite some harmful stuff going on in the world for the apparent sake of security (but in the real interest of exercising power over people). I think that is why I prefer the wording of both the Code of Conduct of KDE¹ and Debian² over the Contributor Covenant. I'd probably take more from those and less from the Contributor Covenant. Anyway, I see myself only as a bystander… so those who are in charge are of course free to take anything from this mail they think is useful and discard the rest. [1] https://www.kde.org/code-of-conduct/ Unlike noted here in the thread before, it does have a provision for leaders to enforce it: "Leaders of any group, such as moderators of mailing lists, IRC channels, forums, etc., will exercise the right to suspend access to any person who persistently breaks our shared Code of Conduct." But it has an important distinction in there: it is a *right*, not an *obligation*. 
[2] https://www.debian.org/code_of_conduct It also has a provision to enforce it: "Serious or persistent offenders will be temporarily or permanently banned from communicating through Debian's systems. Complaints should be made (in private) to the administrators of the Debian communication forum in question. To find contact information for these administrators, please see the page on Debian's organizational structure." Here it is written indirectly as an obligation. Thanks, -- Martin
Re: Code of Conduct: Let's revamp it.
Hello Christoph. Christoph Conrads - 20.09.18, 23:18: > The CoC is extremely ambiguously written for an enforceable document, > any behavior disliked by the maintainers can be punished, and the > level of naivete of the maintainers defending it is surprising for > such a far reaching document. For me the most important point is this: Let Linus have his own experience and insights. It is not up to me to tell him that he might be making this all up or may be completely right in his assessment. I do not know how he got to that experience and insights and what talks in person may have contributed to it. And it's frankly simply not my business. I just congratulated him for his insights and his courage to speak up like this, seeing the potential in it. Not my business either is the CoC that Linux kernel developers and contributors may or may not give themselves. I am mostly a bystander. Sure, I test rc kernels, give (limited, as I usually do not bisect issues) feedback and report bugs. But that is about it. What I see here is that a lot of people who are not even contributing to the Linux kernel in a major way apparently want to make their opinion about the Code of Conduct heard loudly. I ask myself: What is the point of it? Apparently at least some of the major contributors to the Linux kernel see an issue with communication culture on this mailing list and elsewhere. Whether it has been a wise move to just change the CoC to a different text, I read some major contributors opposing this move … I am all for letting people who contribute significantly to the Linux kernel have their own experience and insights. It is simply not my business to interfere with whether they give themselves and the wider community a Code of Conduct and what its content would be. They do the work, and one of them cares for the infrastructure that serves this mailing list. 
Even in case someone would now censor every post I make on LKML or even ban me from using it… I do not think it is up to me to change or control that behavior. Sure, even small contributions count and I even have a tiny, little commit to kernel documentation, but still for me the major point is: Some of the major contributors apparently see that the way of communicating here and elsewhere sometimes (!) does not serve Linux kernel development and the community. By just continuing the way it is, it is unlikely to receive a different outcome. So it is important to change *something*. There is a kernel developer summit where they like to discuss exactly things like this. I do not see it as up to me to try to control the outcome of that process. KDE.org has a code of conduct¹. While at the same time they really have a rather friendly and welcoming environment – if you ask me, one of the most friendly and welcoming ones I have ever witnessed so far. I also still see honest discussions there where people share their point of view and agree to disagree. They are very productive as well. Plasma and KDE applications become better and more usable with every release – yes, Linus, in case you did not decide not to read mails on this list for now, I won't CC your address, KDE stuff is getting better and better. And they work on making the project even more welcoming for newcomers. I'd say I even found friends within that project. They may not even need the CoC, but I do not see it doing any harm either. I really don't see the point of most of the discussion here. What happened now won't be the end of Linux and that's about it. There is no point in predicting doom unless you want it to happen. [1] https://www.kde.org/code-of-conduct/ Thanks, -- Martin
Re: [PATCH security-next v2 00/26] LSM: Explict LSM ordering
Kees Cook - 20.09.18, 18:23: > v2: > - add "lsm.order=" and CONFIG_LSM_ORDER instead of overloading > "security=" - reorganize introduction of ordering logic code > > Updated cover letter: > > This refactors the LSM registration and initialization infrastructure > to more centrally support different LSM types. What was considered a > "major" LSM is kept for legacy use of the "security=" boot parameter, > and now overlaps with the new class of "exclusive" LSMs for the future > blob sharing (to be added later). The "minor" LSMs become more well > defined as a result of the refactoring. > > Instead of continuing to (somewhat improperly) overload the kernel's > initcall system, this changes the LSM infrastructure to store a > registration structure (struct lsm_info) table instead, where metadata > about each LSM can be recorded (name, flags, order, enable flag, init > function). This can be extended in the future to include things like > required blob size for the coming "blob sharing" LSMs. I read the cover letter and still don't know what this is about. Now, I am certainly not engaged deeply with LSM. I bet my main missing piece is: what is a "blob sharing" LSM? I think it would improve the cover letter greatly if it briefly explained what a major LSM is, what a minor LSM is, and what a "blob sharing" LSM is. Why are all of those needed? What is the actual security or end-user benefit of this work? These questions are not meant to question your work. I bet it all makes perfect sense. I just did not understand it from reading the cover letter. > The "major" LSMs had to individually negotiate which of them should be > enabled. This didn't provide a way to negotiate combinations of other > LSMs (as will be needed for "blob sharing" LSMs). This is solved by > providing the LSM infrastructure with all the details needed to make > the choice (exposing the per-LSM "enabled" flag, if used, the LSM > characteristics, and ordering expectations).
> > As a result of the refactoring, the "minor" LSMs are able to remove > the open-coded security_add_hooks() calls for "capability", "yama", > and "loadpin", and to redefine "integrity" properly as a general LSM. > (Note that "integrity" actually defined _no_ hooks, but needs the > early initialization). > > With all LSMs being proessed centrally, it was possible to implement > a new boot parameter "lsm.order=" to provide explicit ordering, which > is helpful for the future "blob sharing" LSMs. Matching this is the > new CONFIG_LSM_ORDER, which replaces CONFIG_DEFAULT_SECURITY, as it > provides a higher granularity of control. > > To better show LSMs activation some debug reporting was added (enabled > with the "lsm.debug" boot commandline option). > > Finally, I added a WARN() around LSM initialization failures, which > appear to have always been silently ignored. (Realistically any LSM > init failures would have only been due to catastrophic kernel issues > that would render a system unworkable anyway, but it'd be better to > expose the problem as early as possible.) 
> > -Kees > > Kees Cook (26): > LSM: Correctly announce start of LSM initialization > vmlinux.lds.h: Avoid copy/paste of security_init section > LSM: Rename .security_initcall section to .lsm_info > LSM: Remove initcall tracing > LSM: Convert from initcall to struct lsm_info > vmlinux.lds.h: Move LSM_TABLE into INIT_DATA > LSM: Convert security_initcall() into DEFINE_LSM() > LSM: Record LSM name in struct lsm_info > LSM: Provide init debugging infrastructure > LSM: Don't ignore initialization failures > LSM: Introduce LSM_FLAG_LEGACY_MAJOR > LSM: Provide separate ordered initialization > LSM: Plumb visibility into optional "enabled" state > LSM: Lift LSM selection out of individual LSMs > LSM: Introduce lsm.enable= and lsm.disable= > LSM: Prepare for reorganizing "security=" logic > LSM: Refactor "security=" in terms of enable/disable > LSM: Build ordered list of ordered LSMs for init > LSM: Introduce CONFIG_LSM_ORDER > LSM: Introduce "lsm.order=" for boottime ordering > LoadPin: Initialize as ordered LSM > Yama: Initialize as ordered LSM > LSM: Introduce enum lsm_order > capability: Mark as LSM_ORDER_FIRST > LSM: Separate idea of "major" LSM from "exclusive" LSM > LSM: Add all exclusive LSMs to ordered initialization > > .../admin-guide/kernel-parameters.txt | 7 + > arch/arc/kernel/vmlinux.lds.S | 1 - > arch/arm/kernel/vmlinux-xip.lds.S | 1 - > arch/arm64/kernel/vmlinux.lds.S | 1 - > arch/h8300/kernel/vmlinux.lds.S | 1 - > arch/microblaze/kernel/vmlinux.lds.S | 2 - > arch/powerpc/kernel/vmlinux.lds.S | 2 - > arch/um/include/asm/common.lds.S | 2 - > arch/xtensa/kernel/vmlinux.lds.S | 1 - > include/asm-generic/vmlinux.lds.h | 25 +- > include/linux/init.h | 2 - >
Re: […] an apology, and a maintainership note
Martin Steigerwald - 17.09.18, 09:57: > Dear Linus. > > Linus Torvalds - 16.09.18, 21:22: > > This is my reality. I am not an emotionally empathetic kind of > > person and that probably doesn't come as a big surprise to anybody. > > Least of all me. The fact that I then misread people and don't > > realize (for years) how badly I've judged a situation and > > contributed to an unprofessional environment is not good. > > > > This week people in our community confronted me about my lifetime of > > not understanding emotions. My flippant attacks in emails have been > > both unprofessional and uncalled for. Especially at times when I > > made it personal. In my quest for a better patch, this made sense > > to me. I know now this was not OK and I am truly sorry. > > > > The above is basically a long-winded way to get to the somewhat > > painful personal admission that hey, I need to change some of my > > behavior, and I want to apologize to the people that my personal > > behavior hurt and possibly drove away from kernel development > > entirely. > > I applaud you for the courage to go the bold step you have gone with > this mail. I can imagine coming up with this mail has been challenging > for you. > > Your step provides a big chance for a shift to happen towards a more > welcoming and friendly Linux kernel community. From what I saw here as > mostly someone who tests rc kernels and as mostly a by-stander of > kernel development you may not be the only one here having challenges > to deal with emotions. That written: quite a few of the rude mails containing swearwords that I read from you were about code, not persons. I think this is an important distinction. I do not have much of an issue with swearing at code :), especially when it is done in a somewhat humorous way. Code quality indeed is important. As are human interactions. -- Martin
Re: […] an apology, and a maintainership note
Dear Linus. Linus Torvalds - 16.09.18, 21:22: > This is my reality. I am not an emotionally empathetic kind of person > and that probably doesn't come as a big surprise to anybody. Least > of all me. The fact that I then misread people and don't realize > (for years) how badly I've judged a situation and contributed to an > unprofessional environment is not good. > > This week people in our community confronted me about my lifetime of > not understanding emotions. My flippant attacks in emails have been > both unprofessional and uncalled for. Especially at times when I made > it personal. In my quest for a better patch, this made sense to me. > I know now this was not OK and I am truly sorry. > > The above is basically a long-winded way to get to the somewhat > painful personal admission that hey, I need to change some of my > behavior, and I want to apologize to the people that my personal > behavior hurt and possibly drove away from kernel development > entirely. I applaud you for the courage to take the bold step you have taken with this mail. I can imagine coming up with this mail has been challenging for you. Your step provides a big chance for a shift towards a more welcoming and friendly Linux kernel community. From what I have seen here, as mostly someone who tests rc kernels and mostly a bystander of kernel development, you may not be the only one here who finds it challenging to deal with emotions. I once learned that there may be two types of personality: one that dives deeply into emotions and one that does not – two types that often have trouble understanding each other. I believe that people of those two types of personality can learn from each other. It is important to move beyond right and wrong or good and bad in this. Whenever I act, I receive feedback (even the lack of feedback is feedback). Do I like this feedback? Or do I want to create a different result?
If I want to create a different result, it is important to act differently, as it is unlikely that the same behavior will create a different result. Thank you, Linus. -- Martin
Re: [Intel-gfx] [REGRESSION 4.19-rc2] sometimes hangs with black screen when resuming from suspend or hibernation (was: Re: Linux 4.19-rc2)
Ville Syrjälä - 12.09.18, 19:10: > On Tue, Sep 11, 2018 at 12:17:05PM +0200, Martin Steigerwald wrote: > > Cc´d Intel Gfx mailing list, in case somebody there knows something: > > > > Cc´d Thorsten for regression tracking… forgot initially. Can also > > open bug report at a later time but so far I cannot provide many > > details about the issue. > > > > Rafael J. Wysocki - 11.09.18, 10:17: > > > On Tue, Sep 11, 2018 at 10:01 AM Martin Steigerwald > > > > wrote: > > > > Hi. > > > > > > > > Linus Torvalds - 02.09.18, 23:45: > > > > > As usual, the rc2 release is pretty small. People are taking a > > > > > > > > With 4.19-rc2 this ThinkPad T520 with i5 Sandybrdige sometimes > > > > hangs > > > > with black screen when resuming from suspend or hibernation. > > > > With > > > > 4.18.1 it did not. Of course there have been userspace related > > > > updates that could be related. > > > > > > > > I currently have no time to dig into this and on this production > > > > laptop I generally do not do bisects between major kernel > > > > releases. > > > > So currently I only answer questions that do not require much > > > > time > > > > to answer. > > > > > > > > For now I switched back to 4.18. If that is stable – and thus > > > > likely > > > > no userspace component is related –, I go with 4.19-rc3 or > > > > whatever > > > > is most recent version to see if the issue has been fixed > > > > already. > > > > > > There were almost no general changes related to system-wide PM > > > between 4.18 and current, so I would suspect one of the device > > > drivers or the x86 core. It also may be something like CPU > > > online/offline, however.> > > I see. I wondered about intel-gfx driver already. Of course it could > > also be something else. > > > > I forgot to mention: The mouse pointer was visible, but the screen > > remained black. > > Did the mouse cursor still move or not? No, it did not move. > Could also try suspend without any GUI stuff in the way. 
Or try the > intel ddx instead of the modesetting ddx (assuming that's what > you're using now) and no compositor to rule out GPU hangs killing > the compositor. The intel ddx can also deal with the GPU not > recovering from a hang by switching to software rendering, > whereas modesetting cannot. Thanks for these suggestions. Currently the laptop is back on 4.18 (4.18.7) and I have not seen this hang after resume so far. If it continues to be stable for a few more days, I will try the latest 4.19 again, as then it is very likely kernel related. > Hmm. Also T520 is an optimus laptop maybe? If there's an nvidia > GPU involved it's going to be hard to get anyone to care. Better > switch that off in the BIOS if you haven't already. Back then I decided on Intel-only graphics. I never regretted this. For me NVidia graphics is not an option, unless NVidia significantly changes their policy regarding free software drivers. Thanks, -- Martin
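P.S.: For my own notes, the ddx switch suggested above should just be a small X server configuration change; a sketch of the snippet I would use, assuming the xf86-video-intel driver package is installed (path and Identifier are illustrative):

```
# /etc/X11/xorg.conf.d/20-intel.conf  (hypothetical path)
# Select the intel ddx instead of the generic modesetting ddx.
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
EndSection
```

Removing the file again should fall back to the X server's autodetection, i.e. the modesetting ddx.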
Re: [REGRESSION 4.19-rc2] sometimes hangs with black screen when resuming from suspend or hibernation (was: Re: Linux 4.19-rc2)
Cc'd the Intel Gfx mailing list, in case somebody there knows something. Cc'd Thorsten for regression tracking… forgot to initially. I can also open a bug report at a later time, but so far I cannot provide many details about the issue. Rafael J. Wysocki - 11.09.18, 10:17: > On Tue, Sep 11, 2018 at 10:01 AM Martin Steigerwald wrote: > > Hi. > > > > Linus Torvalds - 02.09.18, 23:45: > > > As usual, the rc2 release is pretty small. People are taking a > > > > With 4.19-rc2 this ThinkPad T520 with i5 Sandybrdige sometimes hangs > > with black screen when resuming from suspend or hibernation. With > > 4.18.1 it did not. Of course there have been userspace related > > updates that could be related. > > > > I currently have no time to dig into this and on this production > > laptop I generally do not do bisects between major kernel releases. > > So currently I only answer questions that do not require much time > > to answer. > > > > For now I switched back to 4.18. If that is stable – and thus likely > > no userspace component is related –, I go with 4.19-rc3 or whatever > > is most recent version to see if the issue has been fixed already. > > There were almost no general changes related to system-wide PM between > 4.18 and current, so I would suspect one of the device drivers or the > x86 core. It also may be something like CPU online/offline, however. I see. I wondered about the intel-gfx driver already. Of course it could also be something else. I forgot to mention: the mouse pointer was visible, but the screen remained black. That may again point away from the Intel gfx driver. There has been a Mesa update in between in userspace. Currently running 4.18.7 to make sure it is not a userspace issue. Thanks, -- Martin
[REGRESSION 4.19-rc2] sometimes hangs with black screen when resuming from suspend or hibernation (was: Re: Linux 4.19-rc2)
Hi. Linus Torvalds - 02.09.18, 23:45: > As usual, the rc2 release is pretty small. People are taking a With 4.19-rc2 this ThinkPad T520 with i5 Sandybridge sometimes hangs with a black screen when resuming from suspend or hibernation. With 4.18.1 it did not. Of course there have been userspace-related updates that could be related. I currently have no time to dig into this and on this production laptop I generally do not do bisects between major kernel releases. So currently I only answer questions that do not require much time to answer. For now I switched back to 4.18. If that is stable – and thus likely no userspace component is related –, I go with 4.19-rc3 or whatever is the most recent version to see if the issue has been fixed already. % inxi -z -b -G System: Host: […] Kernel: 4.18.1-tp520-btrfstrim x86_64 bits: 64 Desktop: KDE Plasma 5.13.5 Distro: Debian GNU/Linux buster/sid Machine: Type: Laptop System: LENOVO product: 42433WG v: ThinkPad T520 serial: Mobo: LENOVO model: 42433WG serial: UEFI [Legacy]: LENOVO v: 8AET69WW (1.49 ) date: 06/14/2018 […] CPU: Dual Core: Intel Core i5-2520M type: MT MCP speed: 2990 MHz min/max: 800/3200 MHz Graphics: Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics driver: i915 v: kernel Display: x11 server: X.Org 1.20.1 driver: modesetting resolution: 1920x1080~60Hz OpenGL: renderer: Mesa DRI Intel Sandybridge Mobile v: 3.3 Mesa 18.1.7 […] Info: Processes: 322 Uptime: 16m Memory: 15.45 GiB used: 3.12 GiB (20.2%) Shell: zsh inxi: 3.0.22 Thanks, Martin > breather after the merge window, and it takes a bit of time for bug > reports to start coming in and get identified. Plus people were > probably still on vacation (particularly Europe), and some people were > at Open Source Summit NA last week too. Having a calm week was good. 
> > Regardless of the reason, it's pretty quiet/ The bulk of it is drivers > (network and gpu stand out), with the rest being a random collection > all over (arch/x86 and generic networking stands out, but there's > misc stuff all over). > > Go out and test. > > Linus > > --- > > Adrian Hunter (1): > mmc: block: Fix unsupported parallel dispatch of requests > > Ahmad Fatoum (1): > net: macb: Fix regression breaking non-MDIO fixed-link PHYs > > Akshu Agrawal (1): > clk: x86: Set default parent to 48Mhz > > Andi Kleen (2): > x86/spectre: Add missing family 6 check to microcode check > x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+ > > Andrey Grodzovsky (1): > drm/amdgpu: Fix page fault and kasan warning on pci device > remove. > > Andy Lutomirski (1): > x86/nmi: Fix NMI uaccess race against CR3 switching > > Anirudh Venkataramanan (5): > ice: Fix multiple static analyser warnings > ice: Cleanup magic number > ice: Fix bugs in control queue processing > ice: Fix a few null pointer dereference issues > ice: Trivial formatting fixes > > Anson Huang (1): > thermal: of-thermal: disable passive polling when thermal zone > is disabled > > Anssi Hannula (1): > net: macb: do not disable MDIO bus at open/close time > > Ard Biesheuvel (3): > crypto: arm64/sm4-ce - check for the right CPU feature bit > crypto: arm64/aes-gcm-ce - fix scatterwalk API violation > powerpc: disable support for relative ksymtab references > > Arnd Bergmann (1): > net_sched: fix unused variable warning in stmmac > > Ben Hutchings (1): > x86: Allow generating user-space headers without a compiler > > Bo Chen (2): > e1000: check on netif_running() before calling e1000_up() > e1000: ensure to free old tx/rx rings in set_ringparam() > > Brett Creeley (1): > ice: Set VLAN flags correctly > > Bruce Allan (3): > ice: Remove unnecessary node owner check > ice: Update to interrupts enabled in OICR > ice: Change struct members from bool to u8 > > Chaitanya Kulkarni (1): > nvmet: free workqueue 
object if module init fails > > Chengguang Xu (1): > block: remove unnecessary condition check > > Chris Wilson (2): > drm/i915: Stop holding a ref to the ppgtt from each vma > drm/i915/audio: Hook up component bindings even if displays are > disabled > > Christian König (3): > drm/amdgpu: fix VM clearing for the root PD > drm/amdgpu: fix preamble handling > drm/amdgpu: fix holding mn_lock while allocating memory > > Colin Ian King (2): > qed: fix spelling mistake "comparsion" -> "comparison" > x86/xen: remove redundant variable save_pud > > Cong Wang (10): > net_sched: improve and refactor tcf_action_put_many() > net_sched: remove unnecessary ops->delete() > net_sched: remove unused parameter for tcf_action_delete() > net_sched: remove unused tcf_idr_check() > net_sched: remove list_head from
[REGRESSION 4.19-rc2] sometimes hangs with black screen when resuming from suspend or hibernation (was: Re: Linux 4.19-rc2)
Hi.

Linus Torvalds - 02.09.18, 23:45:
> As usual, the rc2 release is pretty small. People are taking a
> breather after the merge window, and it takes a bit of time for bug
> reports to start coming in and get identified. Plus people were
> probably still on vacation (particularly Europe), and some people were
> at Open Source Summit NA last week too. Having a calm week was good.

With 4.19-rc2 this ThinkPad T520 with i5 Sandybridge sometimes hangs with black screen when resuming from suspend or hibernation. With 4.18.1 it did not. Of course there have been userspace related updates that could be related.

I currently have no time to dig into this, and on this production laptop I generally do not do bisects between major kernel releases. So currently I only answer questions that do not require much time to answer.

For now I switched back to 4.18. If that is stable – and thus likely no userspace component is involved – I will go with 4.19-rc3 or whatever is the most recent version to see if the issue has been fixed already.

% inxi -z -b -G
System: Host: […] Kernel: 4.18.1-tp520-btrfstrim x86_64 bits: 64 Desktop: KDE Plasma 5.13.5 Distro: Debian GNU/Linux buster/sid
Machine: Type: Laptop System: LENOVO product: 42433WG v: ThinkPad T520 serial: Mobo: LENOVO model: 42433WG serial: UEFI [Legacy]: LENOVO v: 8AET69WW (1.49 ) date: 06/14/2018 […]
CPU: Dual Core: Intel Core i5-2520M type: MT MCP speed: 2990 MHz min/max: 800/3200 MHz
Graphics: Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics driver: i915 v: kernel Display: x11 server: X.Org 1.20.1 driver: modesetting resolution: 1920x1080~60Hz OpenGL: renderer: Mesa DRI Intel Sandybridge Mobile v: 3.3 Mesa 18.1.7 […]
Info: Processes: 322 Uptime: 16m Memory: 15.45 GiB used: 3.12 GiB (20.2%) Shell: zsh inxi: 3.0.22

Thanks,
Martin
> Regardless of the reason, it's pretty quiet. The bulk of it is drivers
> (network and gpu stand out), with the rest being a random collection
> all over (arch/x86 and generic networking stands out, but there's
> misc stuff all over).
>
> Go out and test.
>
>            Linus
>
> ---
>
> Adrian Hunter (1):
>       mmc: block: Fix unsupported parallel dispatch of requests
>
> Ahmad Fatoum (1):
>       net: macb: Fix regression breaking non-MDIO fixed-link PHYs
>
> Akshu Agrawal (1):
>       clk: x86: Set default parent to 48Mhz
>
> Andi Kleen (2):
>       x86/spectre: Add missing family 6 check to microcode check
>       x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
>
> Andrey Grodzovsky (1):
>       drm/amdgpu: Fix page fault and kasan warning on pci device remove.
>
> Andy Lutomirski (1):
>       x86/nmi: Fix NMI uaccess race against CR3 switching
>
> Anirudh Venkataramanan (5):
>       ice: Fix multiple static analyser warnings
>       ice: Cleanup magic number
>       ice: Fix bugs in control queue processing
>       ice: Fix a few null pointer dereference issues
>       ice: Trivial formatting fixes
>
> Anson Huang (1):
>       thermal: of-thermal: disable passive polling when thermal zone is disabled
>
> Anssi Hannula (1):
>       net: macb: do not disable MDIO bus at open/close time
>
> Ard Biesheuvel (3):
>       crypto: arm64/sm4-ce - check for the right CPU feature bit
>       crypto: arm64/aes-gcm-ce - fix scatterwalk API violation
>       powerpc: disable support for relative ksymtab references
>
> Arnd Bergmann (1):
>       net_sched: fix unused variable warning in stmmac
>
> Ben Hutchings (1):
>       x86: Allow generating user-space headers without a compiler
>
> Bo Chen (2):
>       e1000: check on netif_running() before calling e1000_up()
>       e1000: ensure to free old tx/rx rings in set_ringparam()
>
> Brett Creeley (1):
>       ice: Set VLAN flags correctly
>
> Bruce Allan (3):
>       ice: Remove unnecessary node owner check
>       ice: Update to interrupts enabled in OICR
>       ice: Change struct members from bool to u8
>
> Chaitanya Kulkarni (1):
>       nvmet: free workqueue object if module init fails
>
> Chengguang Xu (1):
>       block: remove unnecessary condition check
>
> Chris Wilson (2):
>       drm/i915: Stop holding a ref to the ppgtt from each vma
>       drm/i915/audio: Hook up component bindings even if displays are disabled
>
> Christian König (3):
>       drm/amdgpu: fix VM clearing for the root PD
>       drm/amdgpu: fix preamble handling
>       drm/amdgpu: fix holding mn_lock while allocating memory
>
> Colin Ian King (2):
>       qed: fix spelling mistake "comparsion" -> "comparison"
>       x86/xen: remove redundant variable save_pud
>
> Cong Wang (10):
>       net_sched: improve and refactor tcf_action_put_many()
>       net_sched: remove unnecessary ops->delete()
>       net_sched: remove unused parameter for tcf_action_delete()
>       net_sched: remove unused tcf_idr_check()
>       net_sched: remove list_head from
Re: POSIX violation by writeback error
Rogier Wolff - 05.09.18, 10:04:
> On Wed, Sep 05, 2018 at 09:39:58AM +0200, Martin Steigerwald wrote:
> > Rogier Wolff - 05.09.18, 09:08:
> > > So when a mail queuer puts mail in the mailq files and the mail
> > > processor can get them out of there intact, nobody is going to
> > > notice. (I know mail queuers should call fsync and report errors
> > > when that fails, but there are bound to be applications where
> > > calling fsync is not appropriate (*))
> >
> > AFAIK at least the Postfix MDA only reports mail as being accepted
> > over SMTP once fsync() on the mail file completed successfully. And
> > I'd expect every sensible MDA to do this. I don't know how the
> > Dovecot MDA which I currently use for sieve support does this though.
>
> Yes. That's why I added the remark that mailers will call fsync and
> know about it on the write side. I encountered a situation in the
> last few days that when a developer runs into this while developing,
> would have caused him to write:
> /* Calling this fsync causes unacceptable performance */
> // fsync (fd);

Hey, I still have

# KDE Sync
# Re: zero size file after power failure with kernel 2.6.30.5
# http://permalink.gmane.org/gmane.comp.file-systems.xfs.general/30512
export KDE_EXTRA_FSYNC=1

in my ~/.zshrc. One reason KDE developers added this switch was that Ext3 had been so slow with fsync(). See also:

Bug 187172 - truncated configuration files on power loss or hard crash
https://bugs.kde.org/187172

> But when apt-get upgrade replaces your /bin/sh and gets a write error
> returning error on subsequent reads is really bad.

I sometimes used eatmydata with apt upgrade / dist-upgrade, but yeah, this asks for trouble on write interruptions.

> It is more difficult than you think.

Heh. :)

Thanks,
-- Martin
Re: POSIX violation by writeback error
Rogier Wolff - 05.09.18, 09:08:
> So when a mail queuer puts mail in the mailq files and the mail
> processor can get them out of there intact, nobody is going to
> notice. (I know mail queuers should call fsync and report errors when
> that fails, but there are bound to be applications where calling
> fsync is not appropriate (*))

AFAIK at least the Postfix MDA only reports mail as being accepted over SMTP once fsync() on the mail file completed successfully. And I'd expect every sensible MDA to do this. I don't know how the Dovecot MDA which I currently use for sieve support does this though.

-- Martin
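The acknowledge-only-after-fsync() behaviour described for Postfix above can be sketched generically in Python. This is an illustrative pattern only, not Postfix's actual code; the function name and the temp-file scheme are made up for the example:

```python
import os

def store_durably(path: str, data: bytes) -> None:
    """Write data and only return once the kernel reports it has
    reached stable storage -- the pattern an MDA follows before it
    acknowledges a mail over SMTP."""
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.write(fd, data)
        os.fsync(fd)      # flush file data; raises OSError on writeback failure
    finally:
        os.close(fd)
    os.rename(tmp, path)  # atomically publish under the final name
    # fsync the containing directory so the rename itself is durable
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

Only after store_durably() returns would the MDA send its positive SMTP reply; an OSError from fsync() would map to a temporary rejection so the sending side retries.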
Re: POSIX violation by writeback error
Jeff Layton - 04.09.18, 17:44:
> > - If the following read() could be served by a page in memory, just
> > returns the data. If the following read() could not be served by a
> > page in memory and the inode/address_space has a writeback error
> > mark, returns EIO. If there is a writeback error on the file, and
> > the requested data could not be served by a page in memory, it
> > means we are reading a (partially) corrupted (out-of-date) file.
> > Receiving an EIO is expected.
>
> No, an error on read is not expected there. Consider this:
>
> Suppose the backend filesystem (maybe an NFSv3 export) is really r/o,
> but was mounted r/w. An application queues up a bunch of writes that
> of course can't be written back (they get EROFS or something when
> they're flushed back to the server), but that application never calls
> fsync.
>
> A completely unrelated application is running as a user that can open
> the file for read, but not r/w. It then goes to open and read the file
> and then gets EIO back or maybe even EROFS.
>
> Why should that application (which did zero writes) have any reason to
> think that the error was due to prior writeback failure by a
> completely separate process? Does EROFS make sense when you're
> attempting to do a read anyway?
>
> Moreover, what is that application's remedy in this case? It just
> wants to read the file, but may not be able to even open it for write
> to issue an fsync to "clear" the error. How do we get things moving
> again so it can do what it wants?
>
> I think your suggestion would open the floodgates for local DoS
> attacks.

I wonder whether a new error code for reporting writeback errors like this could help in this situation. But from all I have read here so far, this is a really challenging situation to deal with.
I still remember how AmigaOS dealt with this case, and from a usability point of view it was close to ideal: if a disk was removed, like a floppy disk, a network disk provided by Envoy or even a hard disk, it popped up a dialog "You MUST insert volume again". And if you did, it continued writing. That worked even with networked devices. I tested it: I unplugged the ethernet cable, replugged it and it continued writing.

I can imagine that this would be quite challenging to implement within Linux. I remember that a Google Summer of Code project to implement this had at least been offered for NetBSD, but I never got to know whether it was taken or even completed. If so it might serve as an inspiration.

Anyway, AmigaOS did this even for stationary hard disks. I had the issue of a flaky connection through an IDE-to-SCSI and then SCSI-to-UW-SCSI adapter. And when the hard disk had connection issues that dialog popped up, with the name of the operating system volume for example. Every access to it was blocked then. It simply blocked all processes that accessed it till it became available again (usually I rebooted in the case of a stationary device, since I had to open the case, or no hot plug was available or working).

But AFAIR AmigaOS also did not have a notion of caching writes for longer than maybe a few seconds or so, and I think just within the device driver. Writes were (almost) immediate. There have been some asynchronous I/O libraries, and I would expect a delay in the dialog popping up in that case.

It would be challenging to implement for Linux even just for removable devices. You have page dirtying and delayed writeback – which is still a performance issue with NFS over 1 GBit when rsyncing huge files from local storage that is faster than 1 GBit; reducing the dirty memory ratio may help to halve the time needed to complete such an rsync copy operation. And you would need to communicate all the way up to userspace to let the user know about the issue.
Still, at least for removable media, this would be almost the most usability friendly approach. With robust filesystems (Amiga Old Filesystem and Fast Filesystem were not robust against sudden write interruption, so the "MUST" was meant that way) one may even offer "Please insert device again to write out unwritten data or choose to discard that data" in a dialog. And for removable media it may even work, as blocking the processes that access it usually would not block the whole system.

But for the operating system disk? I know how the Plasma desktop behaves during massive I/O operations. It usually just completely stalls to a halt. It seems to me that its processes do some I/O almost all of the time … or that the Linux kernel blocks other syscalls too during heavy I/O load.

I just liked to mention it as another crazy idea. But I bet it would practically require rewriting the I/O subsystem in Linux to a great extent, probably diminishing its performance in situations of write pressure. Or maybe a genius finds a way to implement both. :)

What I do think though is that the dirty page caching of Linux with its current standard settings is excessive. 5% / 10% of available
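To put rough numbers on how much unwritten data those dirty-page settings allow, here is a small sketch. The 5% / 10% figures from the mail are used as assumed values for vm.dirty_background_ratio / vm.dirty_ratio; actual defaults differ between kernel versions, and the kernel applies the ratios to available rather than raw total memory:

```python
def dirty_thresholds(total_bytes: int,
                     background_ratio: int = 5,
                     dirty_ratio: int = 10):
    """Approximate the point where background writeback starts and the
    point where writers get throttled, for given percentage settings.
    Illustrative arithmetic only, not the kernel's exact accounting."""
    background = total_bytes * background_ratio // 100
    throttle = total_bytes * dirty_ratio // 100
    return background, throttle

# On a 16 GiB machine these ratios allow on the order of a gigabyte
# of dirty pages before writers are even throttled.
bg, th = dirty_thresholds(16 * 1024 ** 3)
```

With a disk or network path sustaining around 100 MB/s, a gigabyte-scale dirty limit translates into many seconds of data that exists only in RAM, which is why lowering these ratios can shorten the tail of a large rsync.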
Re: Amiga RDB partition support for disks >= 2 TB
Martin Steigerwald - 28.06.18, 13:30: > jdow - 28.06.18, 12:00: > > On 20180628 01:16, Martin Steigerwald wrote: > […] > > > >> That brings to the fore an interesting question. Why bother with > > >> RDBs > > >> over 2TB unless you want a disk with one single partition? This > > >> Win10 > > >> monster I am using has a modest BIOS driver partition for the OS > > >> and > > >> a giant data partition. That smaller partition would easily work > > >> with > > >> any RDB/Filesystem combination since 2.0. So there are some good > > >> workarounds that are probably "safer" and at least as flexible as > > >> RDBs, one Linux has used for a very long time, too. > > > > > > Well, my use case was simple: > > > > > > I had this 2 TB disk and I choose to share it as a backup disk for > > > Linux *and* AmigaOS 4.x on that Sam440ep I still have next to me > > > desk here. > > > > EEK! The hair on my neck is standing up straight! Have you heard > > of SAMBA? The linux mail server firewall etc machine has an extra > > 4TB > > disk on it as a backup for the other systems, although a piddly 4TB > > is small when I save the entire 3G RAID system I have. It's a proof > > of concept so A full backup on a 1gig Ethernet still takes a > > long time. But backing up even an 18GB disk on an Amiga via > > 100Base-t isn't too bad. And disk speeds of the era being what they > > were it's about all you can do anyway. > > Heh, the thing worked just fine in Amiga OS 4. I got away with it > without an issue, until I plugged the disk to my Linux laptop and > wrote data onto the Linux file system. Mind you, I think in that > partition marked LNX\0 I even created a Linux LVM with pvcreate. Do > you call that insane? Well it probably is. :) > > And as an Amiga user I could just return to you: I clicked it, it did > not warn, so all is good :) > > But yeah, as mentioned I researched the topic before. 
And I think > there > has not even been an overflow within the RDB: > > The raw, theoretical limit on the maximum device capacity is about > > 2^105 bytes: > > > > 32 bit rdb_Cylinders * 32 bit rdb_Heads * 32 bit rdb_Sectors * 512 > > bytes/sector for the HD size in struct RigidDiskBlock > > http://wiki.amigaos.net/wiki/RDB_(Amiga_Rigid_Disk_Block) > > Confirmed by: > > The .ADF (Amiga Disk File) format FAQ: > http://lclevy.free.fr/adflib/adf_info.html#p6 > > But what do I write, you know the RDB format :) > > So just do the calculation in 96 Bit and you all are set :) For sectors. Or 3*32+9 (for 512 bytes per sector) = 105 bits for bytes. > Now that is a reason for 128 Bit CPUs :). > > Muuhahaha. -- Martin
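The quoted 2^105-byte figure is easy to verify with a back-of-the-envelope check, assuming 512-byte sectors and three full 32-bit RigidDiskBlock fields as described in the mail:

```python
SECTOR_SIZE = 512  # bytes, as recorded in a typical RDSK block

# rdb_Cylinders, rdb_Heads and rdb_Sectors are each 32-bit fields in
# struct RigidDiskBlock, so the addressable sector count is at most
# 2^32 * 2^32 * 2^32 = 2^96 sectors.
max_sectors = (2 ** 32) ** 3
max_bytes = max_sectors * SECTOR_SIZE  # 2^96 * 2^9 = 2^105 bytes

# 3*32 bits for the sector count plus 9 bits for the 512-byte sector
# size gives exactly the 105 bits mentioned in the mail.
assert max_bytes == 2 ** 105
```

So the "96 bit for sectors, 105 bits for bytes" remark checks out term by term.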
Re: Amiga RDB partition support for disks >= 2 TB
jdow - 28.06.18, 12:00: > On 20180628 01:16, Martin Steigerwald wrote: […] > >> That brings to the fore an interesting question. Why bother with > >> RDBs > >> over 2TB unless you want a disk with one single partition? This > >> Win10 > >> monster I am using has a modest BIOS driver partition for the OS > >> and > >> a giant data partition. That smaller partition would easily work > >> with > >> any RDB/Filesystem combination since 2.0. So there are some good > >> workarounds that are probably "safer" and at least as flexible as > >> RDBs, one Linux has used for a very long time, too. > > > > Well, my use case was simple: > > > > I had this 2 TB disk and I choose to share it as a backup disk for > > Linux *and* AmigaOS 4.x on that Sam440ep I still have next to me > > desk here. > EEK! The hair on my neck is standing up straight! Have you heard > of SAMBA? The linux mail server firewall etc machine has an extra 4TB > disk on it as a backup for the other systems, although a piddly 4TB > is small when I save the entire 3G RAID system I have. It's a proof > of concept so A full backup on a 1gig Ethernet still takes a > long time. But backing up even an 18GB disk on an Amiga via > 100Base-t isn't too bad. And disk speeds of the era being what they > were it's about all you can do anyway. Heh, the thing worked just fine in Amiga OS 4. I got away with it without an issue, until I plugged the disk to my Linux laptop and wrote data onto the Linux file system. Mind you, I think in that partition marked LNX\0 I even created a Linux LVM with pvcreate. Do you call that insane? Well it probably is. :) And as an Amiga user I could just return to you: I clicked it, it did not warn, so all is good :) But yeah, as mentioned I researched the topic before. 
And I think there has not even been an overflow within the RDB: > The raw, theoretical limit on the maximum device capacity is about > 2^105 bytes: > > 32 bit rdb_Cylinders * 32 bit rdb_Heads * 32 bit rdb_Sectors * 512 > bytes/sector for the HD size in struct RigidDiskBlock http://wiki.amigaos.net/wiki/RDB_(Amiga_Rigid_Disk_Block) Confirmed by: The .ADF (Amiga Disk File) format FAQ: http://lclevy.free.fr/adflib/adf_info.html#p6 But what do I write, you know the RDB format :) So just do the calculation in 96 Bit and you all are set :) Now that is a reason for 128 Bit CPUs :). Muuhahaha. Ciao, -- Martin
Amiga RDB partition support for disks >= 2 TB (was: Re: moving affs + RDB partition support to staging?)
ses that overflow on disks 2 TB and larger is a bug, causing the
>>>>>>> issues Martin reported. Your patch addresses that by using the
>>>>>>> correct data type for the calculations (as do other partition
>>>>>>> parsers that may have to deal with large disks) and fixes
>>>>>>> Martin's bug, so appears to be the right thing to do.
>>>>>>>
>>>>>>> Using 64 bit data types for disks smaller than 2 TB where
>>>>>>> calculations don't currently overflow is not expected to cause
>>>>>>> new issues, other than enabling use of disk and partitions
>>>>>>> larger than 2 TB (which may have ramifications with filesystems
>>>>>>> on these partitions). So compatibility is preserved.
>>>>>>>
>>>>>>> Forcing larger block sizes might be a good strategy to avoid
>>>>>>> overflow issues in filesystems as well, but I can't see how the
>>>>>>> block size stored in the RDB would enforce use of the same
>>>>>>> block size in filesystems. We'll have to rely on the filesystem
>>>>>>> tools to get that right, too. Linux AFFS does allow block sizes
>>>>>>> up to 4k (VFS limitation) so this should allow partitions
>>>>>>> larger than 2 TB to work already (but I suspect Al Viro may
>>>>>>> have found a few issues when he looked at the AFFS code so I
>>>>>>> won't say more). Anyway partitioning tools and filesystems are
>>>>>>> unrelated to the Linux partition parser code which is all we
>>>>>>> aim to fix in this patch.
>>>>>>>
>>>>>>> If you feel strongly about unknown ramifications of any
>>>>>>> filesystems on partitions larger than 2 TB, say so and I'll
>>>>>>> have the kernel print a warning about these partitions.
>>>>>>>
>>>>>>> I'll get this patch tested on Martin's test case image as well
>>>>>>> as on a RDB image from a disk known to currently work under
>>>>>>> Linux (thanks Geert for the losetup hint). Can't do much more
>>>>>>> without procuring a working Amiga disk image to use with an
>>>>>>> emulator, sorry. The Amiga I plan to use for tests is a long
>>>>>>> way away from my home indeed.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Michael
>>>>>>>
>>>>>>> Am 26.06.18 um 17:17 schrieb jdow:
>>>>>>>> As long as it preserves compatibility it should be OK, I
>>>>>>>> suppose. Personally I'd make any partitioning tool front end
>>>>>>>> gently force the block size towards 8k as the disk size gets
>>>>>>>> larger. The file systems may also run into 2TB issues that are
>>>>>>>> not obvious. An unused blocks list will have to go beyond a
>>>>>>>> uint32_t size, for example. But a block list (OFS for sure,
>>>>>>>> don't remember for the newer AFS) uses a tad under 1% of the
>>>>>>>> disk all by itself. A block bitmap is not quite so bad. {^_-}
>>>>>>>>
>>>>>>>> Just be sure you are aware of all the ramifications when you
>>>>>>>> make a change. I remember thinking about this for awhile and
>>>>>>>> then determining I REALLY did not want to think about it as my
>>>>>>>> brain was getting tied into a gordian knot.
>>>>>>>>
>>>>>>>> {^_^}
>>>>>>>>
>>>>>>>> On 20180625 19:23, Michael Schmitz wrote:
>>>>>>>>> Joanne,
>>>>>>>>>
>>>>>>>>> Martin's boot log (including your patch) says:
>>>>>>>>>
>>>>>>>>> Jun 19 21:19:09 merkaba kernel: [ 7891.843284] sdb: RDSK (512) sdb1
>>>>>>>>> (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res 2 spb 4)
>>>>>>>>> Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: [sdb]
>>>>>>>>> Attached SCSI disk
>>>>>>>>>
>>>>>>>>> so it's indeed a case of self inflicted damage (RDSK (512)
>>>>>>>>> means 512 byte blocks) and can be worked around by using a
>>>>>>>>> different block size.
>>>>>>>>>
>>>>>>>>> Your memory serves right indeed - blocksize is in 512 bytes
>>>>>>>>> units. I'll still submit a patch to Jens anyway as this may
>>>>>>>>> bite others yet.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Michael
>>>>>>>>>
>>>>>>>>> On Sun, Jun 24, 2018 at 11:40 PM, jdow wrote:
>>>>>>>>>> BTW - anybody who uses 512 byte blocks with an Amiga file
>>>>>>>>>> system is a famn dool.
>>>>>>>>>>
>>>>>>>>>> If memory serves the RDBs think in blocks rather than bytes
>>>>>>>>>> so it should work up to 2 gigablocks whatever your block
>>>>>>>>>> size is. 512 byte blocks is 2199023255552 bytes. But that
>>>>>>>>>> wastes just a WHOLE LOT of disk in block maps. Go up to 4096
>>>>>>>>>> or 8192. The latter is 35 TB.
>>>>>>>>>>
>>>>>>>>>> {^_^}
>>>>>>>>>>
>>>>>>>>>> On 20180624 02:06, Martin Steigerwald wrote:
>>>>>>>>>>> Hi.
> >>>>>>>>>>> > >>>>>>>>>>> Michael Schmitz - 27.04.18, 04:11: > >>>>>>>>>>>> test results at > >>>>>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=43511 > >>>>>>>>>>>> indicate the RDB parser bug is fixed by the patch given > >>>>>>>>>>>> there, > >>>>>>>>>>>> so if > >>>>>>>>>>>> Martin now submits the patch, all should be well? > >>>>>>>>>>> > >>>>>>>>>>> Ok, better be honest than having anyone waiting for it: > >>>>>>>>>>> > >>>>>>>>>>> I do not care enough about this, in order to motivate > >>>>>>>>>>> myself > >>>>>>>>>>> preparing > >>>>>>>>>>> the a patch from Joanne Dow´s fix. > >>>>>>>>>>> > >>>>>>>>>>> I am not even using my Amiga boxes anymore, not even the > >>>>>>>>>>> Sam440ep > >>>>>>>>>>> which > >>>>>>>>>>> I still have in my apartment. > >>>>>>>>>>> > >>>>>>>>>>> So RDB support in Linux it remains broken for disks larger > >>>>>>>>>>> 2 TB, > >>>>>>>>>>> unless > >>>>>>>>>>> someone else does. > >>>>>>>>>>> > >>>>>>>>>>> Thanks. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-m68k" > in the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Martin
Amiga RDB partition support for disks >= 2 TB (was: Re: moving affs + RDB partition support to staging?)
filesystems as well, but I can't see how the block > >>>>>> size > >>>>>> stored > >>>>>> in the RDB would enforce use of the same block size in > >>>>>> filesystems. > >>>>>> We'll have to rely on the filesystem tools to get that right, > >>>>>> too. > >>>>>> Linux > >>>>>> AFFS does allow block sizes up to 4k (VFS limitation) so this > >>>>>> should > >>>>>> allow partitions larger than 2 TB to work already (but I > >>>>>> suspect Al > >>>>>> Viro > >>>>>> may have found a few issues when he looked at the AFFS code so > >>>>>> I > >>>>>> won't > >>>>>> say more). Anyway partitioning tools and filesystems are > >>>>>> unrelated to > >>>>>> the Linux partition parser code which is all we aim to fix in > >>>>>> this > >>>>>> patch. > >>>>>> > >>>>>> If you feel strongly about unknown ramifications of any > >>>>>> filesystems on > >>>>>> partitions larger than 2 TB, say so and I'll have the kernel > >>>>>> print a > >>>>>> warning about these partitions. > >>>>>> > >>>>>> I'll get this patch tested on Martin's test case image as well > >>>>>> as > >>>>>> on a > >>>>>> RDB image from a disk known to currently work under Linux > >>>>>> (thanks > >>>>>> Geert > >>>>>> for the losetup hint). Can't do much more without procuring a > >>>>>> working > >>>>>> Amiga disk image to use with an emulator, sorry. The Amiga I > >>>>>> plan to > >>>>>> use > >>>>>> for tests is a long way away from my home indeed. > >>>>>> > >>>>>> Cheers, > >>>>>> > >>>>>> Michael > >>>>>> > >>>>>> Am 26.06.18 um 17:17 schrieb jdow: > >>>>>>> As long as it preserves compatibility it should be OK, I > >>>>>>> suppose. > >>>>>>> Personally I'd make any partitioning tool front end gently > >>>>>>> force the > >>>>>>> block size towards 8k as the disk size gets larger. The file > >>>>>>> systems > >>>>>>> may also run into 2TB issues that are not obvious. An unused > >>>>>>> blocks > >>>>>>> list will have to go beyond a uint32_t size, for example. 
But > >>>>>>> a > >>>>>>> block > >>>>>>> list (OFS for sure, don't remember for the newer AFS) uses a > >>>>>>> tad > >>>>>>> under > >>>>>>> 1% of the disk all by itself. A block bitmap is not quite so > >>>>>>> bad. > >>>>>>> {^_-} > >>>>>>> > >>>>>>> Just be sure you are aware of all the ramifications when you > >>>>>>> make a > >>>>>>> change. I remember thinking about this for awhile and then > >>>>>>> determining > >>>>>>> I REALLY did not want to think about it as my brain was > >>>>>>> getting tied > >>>>>>> into a gordian knot. > >>>>>>> > >>>>>>> {^_^} > >>>>>>> > >>>>>>> On 20180625 19:23, Michael Schmitz wrote: > >>>>>>>> Joanne, > >>>>>>>> > >>>>>>>> Martin's boot log (including your patch) says: > >>>>>>>> > >>>>>>>> Jun 19 21:19:09 merkaba kernel: [ 7891.843284] sdb: RDSK > >>>>>>>> (512) > >>>>>>>> sdb1 > >>>>>>>> (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 > >>>>>>>> (DOS^C)(res > >>>>>>>> 2 spb > >>>>>>>> 4) > >>>>>>>> Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: > >>>>>>>> [sdb] > >>>>>>>> Attached SCSI disk > >>>>>>>> > >>>>>>>> so it's indeed a case of self inflicted damage (RDSK (512) > >>>>>>>> means > >>>>>>>> 512 > >>>>>>>> byte blocks) and can be worked around by using a different > >>>>>>>> block > >>>>>>>> size. > >>>>>>>> > >>>>>>>> Your memory serves right indeed - blocksize is in 512 bytes > >>>>>>>> units. > >>>>>>>> I'll still submit a patch to Jens anyway as this may bite > >>>>>>>> others > >>>>>>>> yet. > >>>>>>>> > >>>>>>>> Cheers, > >>>>>>>> > >>>>>>>> Michael > >>>>>>>> > >>>>>>>> On Sun, Jun 24, 2018 at 11:40 PM, jdow wrote: > >>>>>>>>> BTW - anybody who uses 512 byte blocks with an Amiga file > >>>>>>>>> system is > >>>>>>>>> a famn > >>>>>>>>> dool. > >>>>>>>>> > >>>>>>>>> If memory serves the RDBs think in blocks rather than bytes > >>>>>>>>> so it > >>>>>>>>> should > >>>>>>>>> work up to 2 gigablocks whatever your block size is. 512 > >>>>>>>>> blocks is > >>>>>>>>> 219902322 bytes. 
But that wastes just a WHOLE LOT of > >>>>>>>>> disk in > >>>>>>>>> block maps. > >>>>>>>>> Go up to 4096 or 8192. The latter is 35 TB. > >>>>>>>>> > >>>>>>>>> {^_^} > >>>>>>>>> > >>>>>>>>> On 20180624 02:06, Martin Steigerwald wrote: > >>>>>>>>>> Hi. > >>>>>>>>>> > >>>>>>>>>> Michael Schmitz - 27.04.18, 04:11: > >>>>>>>>>>> test results at > >>>>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=43511 > >>>>>>>>>>> indicate the RDB parser bug is fixed by the patch given > >>>>>>>>>>> there, > >>>>>>>>>>> so if > >>>>>>>>>>> Martin now submits the patch, all should be well? > >>>>>>>>>> > >>>>>>>>>> Ok, better be honest than having anyone waiting for it: > >>>>>>>>>> > >>>>>>>>>> I do not care enough about this, in order to motivate > >>>>>>>>>> myself > >>>>>>>>>> preparing > >>>>>>>>>> the a patch from Joanne Dow´s fix. > >>>>>>>>>> > >>>>>>>>>> I am not even using my Amiga boxes anymore, not even the > >>>>>>>>>> Sam440ep > >>>>>>>>>> which > >>>>>>>>>> I still have in my apartment. > >>>>>>>>>> > >>>>>>>>>> So RDB support in Linux it remains broken for disks larger > >>>>>>>>>> 2 TB, > >>>>>>>>>> unless > >>>>>>>>>> someone else does. > >>>>>>>>>> > >>>>>>>>>> Thanks. 
> >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> To unsubscribe from this list: send the line "unsubscribe > >>>>>>>>> linux-m68k" in > >>>>>>>>> the body of a message to majord...@vger.kernel.org > >>>>>>>>> More majordomo info at > >>>>>>>>> http://vger.kernel.org/majordomo-info.html > >>>>>>> > >>>>>>> -- > >>>>>>> To unsubscribe from this list: send the line "unsubscribe > >>>>>>> linux-m68k" in > >>>>>>> the body of a message to majord...@vger.kernel.org > >>>>>>> More majordomo info at > >>>>>>> http://vger.kernel.org/majordomo-info.html > >>>>> > >>>>> -- > >>>>> To unsubscribe from this list: send the line "unsubscribe > >>>>> linux-m68k" in > >>>>> the body of a message to majord...@vger.kernel.org > >>>>> More majordomo info at > >>>>> http://vger.kernel.org/majordomo-info.html > > -- > To unsubscribe from this list: send the line "unsubscribe linux-m68k" > in the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Martin
Re: Amiga RDB partition support for disks >= 2 TB (was: Re: moving affs + RDB partition support to staging?)
Changing subject, so that there is at least a chance for someone to find this discussion with a search engine :) Joanne, jdow - 28.06.18, 04:57: > The issue is what happens when one of those disks appears on a 3.1 > system. {^_^} That is right, so I think the warning about 64 bit support in native OS is okay, but that issue already exists *without* Linux. Remember, I created that RDB with Media Toolbox on AmigaOS 4.0. I did not even use Linux to create that beast :). If it booms in AmigaOS < 4 without NSD64, TD64 or SCSI direct, that would happen with or without the warning in Linux, even without the disk ever having been seen by a Linux kernel. I´d say the warning about support in native OS does no harm, even when it is about educating Amiga users who, in case they use such large drives – and I pretty much bet some of them do –, had better know the limitations beforehand. I do think the extra kernel option does not make all that much sense, but I accept it anyway. Because if you argue like that, what would need fixing is AmigaOS < 4 without NSD64, TD64 or SCSI direct, but then that is what NSD64 and TD64 were made for more than 10 years ago. Of course, if a partitioning tool for Linux ever allows creating such an RDB, it makes sense to add a big fat warning about that. As… I think it would make sense to have one in Media Toolbox and the AmigaOS partitioning tools. However, Linux here just reads the RDB, so I´d personally go with the warning about support in native OS, but spare myself the extra kernel option stuff. It is Michael´s call though, as he submits the patch. And if he chooses to be on the safer side of this, that is fine with me. Thanks, Martin > On 20180627 01:03, Martin Steigerwald wrote: > > Dear Joanne. > > > > jdow - 27.06.18, 08:24: > >> You allergic to using a GPT solution? It will get away from some of > >> the evils that RDB has inherent in it because they are also > >> features? 
> >> (Loading a filesystem or DriveInit code from RDBs is just asking > >> for > >> a nearly impossible to remove malware infection.) Furthermore, any > >> 32 > >> bit system that sees an RDSK block is going to try to translate it. > >> If you add a new RDB format you are going to get bizarre and > >> probably > >> quite destructive results from the mistake. Fail safe is a rather > >> good notion, methinks. > >> > >> Personally I figure this is all rather surreal. 2TG of junk on an > >> Amiga system seems utterly outlandish to me. You cited another > >> overflow potential. There are at least three we've identified, I > >> believe. Are you 100% sure there are no more? The specific one you > >> mention of translating RDB to Linux has a proper solution in the > >> RDB > >> reader. It should recover such overflow errors in the RDB as it can > >> with due care and polish. It should flag any other overflow error > >> it > >> detects within the RDBs and return an error such as to leave the > >> disk > >> unmounted or mounted read-only if you feel like messing up a poor > >> sod's backups. The simple solution is to read each of the variables > >> with the nominal RDB size and convert it to uint64_t before > >> calculating byte indices. > >> > >> However, consider my inputs as advice from an adult who has seen > >> the > >> Amiga Elephant so to speak. I am not trying to assert any control. > >> Do > >> as you wish; but, I would plead with you to avoid ANY chance you > >> can > >> for the user to make a bonehead stupid move and lose all his > >> treasured disk archives. Doing otherwise is very poor form. > > > > I am pretty confident that larger than 2 TiB disks are fully > > supported within AmigaOS 4, as I outlined in my other mail. > > > > So with all due respect: I used a larger than 2 TiB disk in AmigaOS > > 4 in 2012 already *just* fine. I even found I had the same > > questions back then, and researched it. 
Which lead to this official > > article back then: > > > > http://wiki.amigaos.net/wiki/RDB > > > > I am also pretty sure that AmigaOS still uses RDB as partitioning > > format. They support MBR. I don´t think AmigaOS 4.1 supports GPT. > > Whether to implement that of course is the decision of AmigaOS 4 > > development team. I am no longer a member of it since some time. > > > > Linux m68k should already be able to use disks in GPT format, but > > you > > likely won´t be able to read them in AmigaOS, unless there is some > > third party support for it meanwhile. > > > > Thanks, > > Martin >
Re: moving affs + RDB partition support to staging?
ion of a good idea. You'd discover that as soon as an > >> RDB uint64_t disk is tasted by a uint32_t only system. If it is > >> for your personal use then you're entirely free to reject my > >> advice and are probably smart enough to keep it working for > >> yourself. > >> > >> GPT is probably the right way to go. Preserve the ability to read > >> RDBs for legacy disks only. > >> > >> {^_^} > >> > >> On 20180626 01:31, Michael Schmitz wrote: > >>> Joanne, > >>> > >>> I think we all agree that doing 32 bit calculations on 512-byte > >>> block > >>> addresses that overflow on disks 2 TB and larger is a bug, causing > >>> the issues Martin reported. Your patch addresses that by using > >>> the correct data type for the calculations (as do other partition > >>> parsers that may have to deal with large disks) and fixes > >>> Martin's bug, so appears to be the right thing to do. > >>> > >>> Using 64 bit data types for disks smaller than 2 TB where > >>> calculations don't currently overflow is not expected to cause > >>> new issues, other than enabling use of disk and partitions larger > >>> than 2 TB (which may have ramifications with filesystems on these > >>> partitions). So comptibility is preserved. > >>> > >>> Forcing larger block sizes might be a good strategy to avoid > >>> overflow > >>> issues in filesystems as well, but I can't see how the block size > >>> stored in the RDB would enforce use of the same block size in > >>> filesystems. We'll have to rely on the filesystem tools to get > >>> that right, too. Linux AFFS does allow block sizes up to 4k (VFS > >>> limitation) so this should allow partitions larger than 2 TB to > >>> work already (but I suspect Al Viro may have found a few issues > >>> when he looked at the AFFS code so I won't say more). Anyway > >>> partitioning tools and filesystems are unrelated to the Linux > >>> partition parser code which is all we aim to fix in this patch. 
> >>> > >>> If you feel strongly about unknown ramifications of any > >>> filesystems on partitions larger than 2 TB, say so and I'll have > >>> the kernel print a warning about these partitions. > >>> > >>> I'll get this patch tested on Martin's test case image as well as > >>> on a RDB image from a disk known to currently work under Linux > >>> (thanks Geert for the losetup hint). Can't do much more without > >>> procuring a working Amiga disk image to use with an emulator, > >>> sorry. The Amiga I plan to use for tests is a long way away from > >>> my home indeed. > >>> > >>> Cheers, > >>> > >>> Michael > >>> > >>> Am 26.06.18 um 17:17 schrieb jdow: > >>>> As long as it preserves compatibility it should be OK, I suppose. > >>>> Personally I'd make any partitioning tool front end gently force > >>>> the > >>>> block size towards 8k as the disk size gets larger. The file > >>>> systems > >>>> may also run into 2TB issues that are not obvious. An unused > >>>> blocks > >>>> list will have to go beyond a uint32_t size, for example. But a > >>>> block > >>>> list (OFS for sure, don't remember for the newer AFS) uses a tad > >>>> under 1% of the disk all by itself. A block bitmap is not quite > >>>> so bad. {^_-} > >>>> > >>>> Just be sure you are aware of all the ramifications when you make > >>>> a > >>>> change. I remember thinking about this for awhile and then > >>>> determining I REALLY did not want to think about it as my brain > >>>> was getting tied into a gordian knot. > >>>> > >>>> {^_^} > >>>> > >>>> On 20180625 19:23, Michael Schmitz wrote: > >>>>> Joanne, > >>>>> > >>>>> Martin's boot log (including your patch) says: > >>>>> > >>>>> Jun 19 21:19:09 merkaba kernel: [ 7891.843284] sdb: RDSK (512) > >>>>> sdb1 > >>>>> (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res > >>>>> 2 spb > >>>>> 4) > >>>>> Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:
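The overflow Michael describes is easy to reproduce outside the kernel. The sketch below is not the actual parser code, and the geometry values are made up for illustration; the point is the pattern: multiplying 32-bit RDB fields in 32-bit arithmetic wraps silently once the product passes 2^32 blocks, and widening the type before multiplying (the shape of the fix) does not.

```python
def rdb_partition_start(lowcyl, surfaces, blocks_per_track, wide=True):
    """Compute a partition's first block from RDB-style geometry fields.

    With wide=False the product is truncated to 32 bits, mimicking
    what 32-bit C arithmetic does; wide=True mimics the patched code,
    which widens the operands (e.g. to sector_t) before multiplying.
    """
    start = lowcyl * surfaces * blocks_per_track
    return start if wide else start & 0xFFFFFFFF

# Hypothetical geometry placing a partition past the 2 TiB mark
# on a 512-byte-block disk:
print(rdb_partition_start(70000, 16, 4096))              # 4587520000
print(rdb_partition_start(70000, 16, 4096, wide=False))  # 292552704, silently wrapped
```

In C the equivalent fix is casting one operand before the multiplication, e.g. `(u64)lowcyl * surfaces * blocks_per_track`, so the whole product is computed in 64 bits.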
Re: moving affs + RDB partition support to staging?
sers that may have to deal with large disks) and fixes Martin's > >> bug, so appears to be the right thing to do. > >> > >> Using 64 bit data types for disks smaller than 2 TB where > >> calculations don't currently overflow is not expected to cause new > >> issues, other than enabling use of disk and partitions larger than > >> 2 TB (which may have ramifications with filesystems on these > >> partitions). So comptibility is preserved. > >> > >> Forcing larger block sizes might be a good strategy to avoid > >> overflow > >> issues in filesystems as well, but I can't see how the block size > >> stored in the RDB would enforce use of the same block size in > >> filesystems. We'll have to rely on the filesystem tools to get > >> that right, too. Linux AFFS does allow block sizes up to 4k (VFS > >> limitation) so this should allow partitions larger than 2 TB to > >> work already (but I suspect Al Viro may have found a few issues > >> when he looked at the AFFS code so I won't say more). Anyway > >> partitioning tools and filesystems are unrelated to the Linux > >> partition parser code which is all we aim to fix in this patch. > >> > >> If you feel strongly about unknown ramifications of any filesystems > >> on partitions larger than 2 TB, say so and I'll have the kernel > >> print a warning about these partitions. > >> > >> I'll get this patch tested on Martin's test case image as well as > >> on a RDB image from a disk known to currently work under Linux > >> (thanks Geert for the losetup hint). Can't do much more without > >> procuring a working Amiga disk image to use with an emulator, > >> sorry. The Amiga I plan to use for tests is a long way away from > >> my home indeed. > >> > >> Cheers, > >> > >> Michael > >> > >> Am 26.06.18 um 17:17 schrieb jdow: > >>> As long as it preserves compatibility it should be OK, I suppose. > >>> Personally I'd make any partitioning tool front end gently force > >>> the > >>> block size towards 8k as the disk size gets larger. 
The file > >>> systems > >>> may also run into 2TB issues that are not obvious. An unused > >>> blocks > >>> list will have to go beyond a uint32_t size, for example. But a > >>> block > >>> list (OFS for sure, don't remember for the newer AFS) uses a tad > >>> under 1% of the disk all by itself. A block bitmap is not quite > >>> so bad. {^_-} > >>> > >>> Just be sure you are aware of all the ramifications when you make > >>> a > >>> change. I remember thinking about this for awhile and then > >>> determining I REALLY did not want to think about it as my brain > >>> was getting tied into a gordian knot. > >>> > >>> {^_^} > >>> > >>> On 20180625 19:23, Michael Schmitz wrote: > >>>> Joanne, > >>>> > >>>> Martin's boot log (including your patch) says: > >>>> > >>>> Jun 19 21:19:09 merkaba kernel: [ 7891.843284] sdb: RDSK (512) > >>>> sdb1 > >>>> (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res 2 > >>>> spb > >>>> 4) > >>>> Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: [sdb] > >>>> Attached SCSI disk > >>>> > >>>> so it's indeed a case of self inflicted damage (RDSK (512) means > >>>> 512 > >>>> byte blocks) and can be worked around by using a different block > >>>> size. > >>>> > >>>> Your memory serves right indeed - blocksize is in 512 bytes > >>>> units. > >>>> I'll still submit a patch to Jens anyway as this may bite others > >>>> yet. > >>>> > >>>> Cheers, > >>>> > >>>> Michael > >>>> > >>>> On Sun, Jun 24, 2018 at 11:40 PM, jdow wrote: > >>>>> BTW - anybody who uses 512 byte blocks with an Amiga file system > >>>>> is > >>>>> a famn > >>>>> dool. > >>>>> > >>>>> If memory serves the RDBs think in blocks rather than bytes so > >>>>> it > >>>>> should > >>>>> work up to 2 gigablocks whatever your block size is. 512 blocks > >>>>> is > >
Re: moving affs + RDB partition support to staging?
Joanne. jdow - 26.06.18, 07:17: > As long as it preserves compatibility it should be OK, I suppose. > Personally I'd make any partitioning tool front end gently force the > block size towards 8k as the disk size gets larger. The file systems > may also run into 2TB issues that are not obvious. An unused blocks > list will have to go beyond a uint32_t size, for example. But a block > list (OFS for sure, don't remember for the newer AFS) uses a tad > under 1% of the disk all by itself. A block bitmap is not quite so > bad. {^_-} > > Just be sure you are aware of all the ramifications when you make a > change. I remember thinking about this for awhile and then > determining I REALLY did not want to think about it as my brain was > getting tied into a gordian knot.

Heh… :) Well, all I can say is that it just worked on AmigaOS 4. I did not see any data corruption in any of the filesystems, well, as far as I have been able to check. There was no repair tool for JXFS, I think. But as I migrated the data on it to SFS, I was able to copy everything. Famous last words. Most importantly, the disk size was detected properly and there was no overflow like on Linux. So I'd say: rather one error less than one error more.

Of course, it could also be an option to outright *refuse* to detect such disks, with a big fat warning in the kernel log that handling them may be unsafe. But overflowing, and thus overwriting existing data on writes, is not safe. I do think it is safe enough, but that is based on what I know about the internals of RDB (more than the average user certainly, but not as much as you or some other AmigaOS developer who dug deeper into it). So in case you'd rather see Linux refuse to handle disks like that, that would also be fine with me. Just do not handle them in the broken way they are handled in Linux now, i.e. do not deliberately corrupt things, as in "It's dangerous, so let's overwrite data straight away, so the user gets it."
:) Anyway, in my opinion RDB is still just so much more advanced than MBR and in some parts even on par with the much later GPT. With some limitations, it is a quite brilliant partition format, if you ask me. > {^_^} > > On 20180625 19:23, Michael Schmitz wrote: > > Joanne, > > > > Martin's boot log (including your patch) says: > > > > Jun 19 21:19:09 merkaba kernel: [ 7891.843284] sdb: RDSK (512) sdb1 > > (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res 2 > > spb > > 4) > > Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: [sdb] > > Attached SCSI disk > > > > so it's indeed a case of self inflicted damage (RDSK (512) means 512 > > byte blocks) and can be worked around by using a different block > > size. > > > > Your memory serves right indeed - blocksize is in 512 bytes units. > > I'll still submit a patch to Jens anyway as this may bite others > > yet. > > > > Cheers, > > > > Michael > > > > On Sun, Jun 24, 2018 at 11:40 PM, jdow wrote: > >> BTW - anybody who uses 512 byte blocks with an Amiga file system is > >> a famn dool. > >> > >> If memory serves the RDBs think in blocks rather than bytes so it > >> should work up to 2 gigablocks whatever your block size is. 512 > >> blocks is 2199023255552 bytes. But that wastes just a WHOLE LOT of > >> disk in block maps. Go up to 4096 or 8192. The latter is 35 TB. > >> > >> {^_^} > >> > >> On 20180624 02:06, Martin Steigerwald wrote: > >>> Hi. > >>> > >>> Michael Schmitz - 27.04.18, 04:11: > >>>> test results at https://bugzilla.kernel.org/show_bug.cgi?id=43511 > >>>> indicate the RDB parser bug is fixed by the patch given there, so > >>>> if > >>>> Martin now submits the patch, all should be well? > >>> > >>> Ok, better be honest than having anyone waiting for it: > >>> > >>> I do not care enough about this, in order to motivate myself > >>> preparing the a patch from Joanne Dow´s fix. > >>> > >>> I am not even using my Amiga boxes anymore, not even the Sam440ep > >>> which I still have in my apartment. 
> >>> > >>> So RDB support in Linux it remains broken for disks larger 2 TB, > >>> unless someone else does. > >>> > >>> Thanks. […] -- Martin
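jdow's figures above follow directly from RDB addressing disks in 32-bit block numbers: the largest addressable disk is 2^32 blocks times the block size. A quick check (my own arithmetic, not anything from the thread itself):

```python
def rdb_max_bytes(block_size):
    """Largest disk size addressable with 32-bit block numbers."""
    return (2 ** 32) * block_size

for bs in (512, 4096, 8192):
    size = rdb_max_bytes(bs)
    print(f"{bs:5d}-byte blocks -> {size} bytes (~{size / 10**12:.1f} TB)")
```

This gives roughly 2.2 TB for 512-byte blocks, 17.6 TB for 4096, and 35.2 TB for 8192, matching the "the latter is 35 TB" remark, and it is why 512-byte-block RDB disks start overflowing 32-bit arithmetic right at the 2 TB boundary Martin hit.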
Re: moving affs + RDB partition support to staging?
Michael. Michael Schmitz - 26.06.18, 04:23: > Joanne, > > Martin's boot log (including your patch) says: > > Jun 19 21:19:09 merkaba kernel: [ 7891.843284] sdb: RDSK (512) sdb1 > (LNX^@)(res 2 spb 1) sdb2 (JXF^D)(res 2 spb 1) sdb3 (DOS^C)(res 2 spb > 4) > Jun 19 21:19:09 merkaba kernel: [ 7891.844055] sd 7:0:0:0: [sdb] > Attached SCSI disk > > so it's indeed a case of self inflicted damage (RDSK (512) means 512 > byte blocks) and can be worked around by using a different block size.

Well, I pretty much believe that this was the default block size of the tool I used to create the RDB. I think it was Media Toolbox + the engine behind it, from AmigaOS 4.0 (not 4.1) or so. DOSType JXF points to JXFS, an AmigaOS 4 only filesystem that has meanwhile been deprecated by AmigaOS upstream. Maybe HDToolbox + hdwrench.library by Joanne (AmigaOS 3.5/3.9) would have set it up differently.

Anyway: it works like this in AmigaOS 4 without any issues, with 512 byte blocks. I think it is good that Linux then does not create data corruption when writing to disks that work just fine in AmigaOS. Especially as those using AmigaNG machines like the AmigaOne X1000/X5000 or ACube Sam machines may dual boot AmigaOS and Linux on their machines. Thanks again for putting together a patch. Thanks, -- Martin
Re: moving affs + RDB partition support to staging? (was: Re: Moving unmaintained filesystems to staging)
Hi Michael. Michael Schmitz - 25.06.18, 09:53: > OK, I'll prepare a patch and submit it to linux-block for review. I'll > have to refer to your testing back in 2012 since all I can test is > whether the patch still allows partition tables on small disks to be > recognized at this time (unless Adrian has a 2 TB disk and a > SATA-SCSI bridge to test this properly on). Thank you very much. I believe the testing I did is still valid. Feel free to include any parts of the description of the test I made back then into your patch description as you see it as relevant for it. Also feel free to include my Tested-By: (probably with a hint the test was in 2012). I am not sure I am ready to permanently let go of (some of) the hardware I still have collected, but at some time I may. I do have SATA-SCSI bridges. That A4000T I have here probably could be a nice Debian m68k build host. Thanks, Martin > > Cheers, > > Michael > > Am 24.06.18 um 21:06 schrieb Martin Steigerwald: > > Hi. > > > > Michael Schmitz - 27.04.18, 04:11: > >> test results at https://bugzilla.kernel.org/show_bug.cgi?id=43511 > >> indicate the RDB parser bug is fixed by the patch given there, so > >> if > >> Martin now submits the patch, all should be well? > > > > Ok, better be honest than having anyone waiting for it: > > > > I do not care enough about this, in order to motivate myself > > preparing the a patch from Joanne Dow´s fix. > > > > I am not even using my Amiga boxes anymore, not even the Sam440ep > > which I still have in my apartment. > > > > So RDB support in Linux it remains broken for disks larger 2 TB, > > unless someone else does. > > > > Thanks. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-m68k" > in the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Martin
Re: moving affs + RDB partition support to staging? (was: Re: Moving unmaintained filesystems to staging)
Hi. Michael Schmitz - 27.04.18, 04:11: > test results at https://bugzilla.kernel.org/show_bug.cgi?id=43511 > indicate the RDB parser bug is fixed by the patch given there, so if > Martin now submits the patch, all should be well? Ok, better to be honest than to keep anyone waiting for it: I do not care enough about this to motivate myself to prepare a patch from Joanne Dow's fix. I am not even using my Amiga boxes anymore, not even the Sam440ep which I still have in my apartment. So RDB support in Linux remains broken for disks larger than 2 TB, unless someone else fixes it. Thanks. -- Martin
Linux on Intel x86: How to disable Hyperthreading
Cc´d the people listed for the X86 architecture in MAINTAINERS. Hi! According to https://www.kuketz-blog.de/tlbleed-neue-sicherheitsluecken-bei-intel-cpus/ in https://www.blackhat.com/us-18/briefings/schedule/#tlbleed-when-protecting-your-cpu-caches-is-not-enough-10149 (with my current settings and add-ons, the Firefox and Chromium web browsers do not display the contents of the page) one can read: > Our TLBleed exploit successfully leaks a 256-bit EdDSA key from > libgcrypt (used in e.g. GPG) with a 98% success rate after just a > single observation of signing operation on a co-resident hyperthread > and just 17 seconds of analysis time. And this probably explains why OpenBSD disabled it. What is your take on this? Also, how would you disable it, in case you would? I am currently test-running with the script I found at: https://serverfault.com/questions/235825/disable-hyperthreading-from-within-linux-no-access-to-bios just to see whether I notice a significant performance drop. Thanks. -- Martin
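For reference, scripts like the one linked above work by walking sysfs: each CPU exposes its hyperthread siblings in /sys/devices/system/cpu/cpuN/topology/thread_siblings_list, and writing 0 to .../cpuN/online takes a sibling offline. A minimal sketch of the core selection logic, assuming the sysfs list format ("0,4" or "0-3"); the helper names below are made up for illustration, not taken from the actual script:

```python
def parse_cpu_list(s: str) -> list[int]:
    """Parse a sysfs CPU list such as '0,4' or '0-3' into a list of ints."""
    cpus = []
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus

def siblings_to_offline(sibling_lists: list[str]) -> set[int]:
    """Keep the lowest-numbered CPU of each sibling group, offline the rest."""
    offline = set()
    for s in sibling_lists:
        cpus = sorted(parse_cpu_list(s))
        offline.update(cpus[1:])  # everything but the first sibling goes offline
    return offline

# On a 2-core/4-thread box where cpu0+cpu2 and cpu1+cpu3 are sibling pairs,
# this selects cpus 2 and 3; the script would then do the equivalent of
#   echo 0 > /sys/devices/system/cpu/cpu2/online   (and likewise for cpu3)
```

The effect is reversible at runtime (write 1 back to `online`), which makes it handy for the kind of before/after performance comparison described above.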
Re: moving affs + RDB partition support to staging?
Michael Schmitz - 07.05.18, 04:40: > Al, > > I don't think there is USB sticks with affs on them as yet. There > isn't even USB host controller support for Amiga hardware (yet). > > Last I tried USB on m68k (Atari, 060 accelerator) the desktop > experience was such that I'd rather not repeat that in a hurry (and > that was a simple FAT USB stick). USB support has been available on the Amiga for a long time. On "Classic" Amigas: AmigaOS 3.x with the Poseidon USB stack + some USB card. On AmigaOS 4.x it is built in. AmigaOS 4.x hardware like the Sam boards from Acube Systems has USB controllers that work out of the box. And I am pretty sure you can also tell it to use the Amiga Fast File System (affs on Linux) on a USB stick. Also you can plug in an external harddisk with RDB partitions and whatever filesystems you wish. Thanks, -- Martin
Re: moving affs + RDB partition support to staging?
Al Viro - 06.05.18, 02:59: > On Thu, Apr 26, 2018 at 12:45:41PM +0200, John Paul Adrian Glaubitz wrote: > > Exactly. It works fine as is: > > > > root@elgar:~> uname -a > > Linux elgar 4.16.0-rc2-amiga-16784-ga8917fc #650 Mon Mar 5 15:32:52 > > NZDT 2018 m68k GNU/Linux root@elgar:~> mount /dev/sda1 /mnt -taffs > > root@elgar:~> ls -l /mnt | head > > total 0 > > drwx------ 1 root root 0 Mar 30 2001 Alt > > -rw------- 1 root root 1352 Mar 27 1997 Alt.info > > drwx------ 1 root root 0 Nov 16 14:39 C > > drwx------ 1 root root 0 Mar 27 1997 CS_Fonts > > drwx------ 1 root root 0 Mar 27 1997 Classes > > -rwx------ 1 root root 1132 Aug 14 1996 Classes.info > > drwx------ 1 root root 0 Feb 10 2004 Commodities > > -rw------- 1 root root 628 Jan 14 2002 Commodities.info > > drwx------ 1 root root 0 Apr 10 1999 CyberTools > > root@elgar:~> mount |grep affs > > /dev/sda1 on /mnt type affs (rw,relatime,bs=512,volume=:) > > root@elgar:~> > > > > There is nothing at the moment that needs fixing. > > Funny, that... I'd been going through the damn thing for the > last week or so; open-by-fhandle/nfs export support is completely > buggered. And as for the rest... the least said about the error > handling, the better - something like rename() hitting an IO > error (read one) can not only screw the on-disk data into the > ground, it can do seriously bad things to kernel data structures. > > Is there anything resembling fsck for that thing, BTW? Nevermind > the repairs, just the validity checks would be nice... I am not aware of a fsck command for affs on Linux. There is a partitioning tool called amiga-fdisk, but for checking a filesystem you would need to use a tool under AmigaOS. -- Martin
Re: moving affs + RDB partition support to staging?
John Paul Adrian Glaubitz - 06.05.18, 10:52: > On 04/27/2018 03:26 AM, jdow wrote: > > And before I forget there are two features of the RDBs that I > > heartily recommend never implementing on Linux. They were good > > ideas at the time; but, times changed. The RDBs are capable of > > storing a filesystem driver and some drive init code for the plugin > > disk driver card. That is giving malware authors entirely goo easy > > a shot at owning a machine. Martin S., I would strongly suggest > > that going forward those two capabilities be removed from the RDB > > readers in AmigaOS as well as Linux OS. > > I assume removing the feature for AmigaOS isn't really possible since > we don't have the source code for that, do we? AmigaOS 4.x does not support loading filesystems from RDB anymore, as far as I know. I am not involved with AmigaOS development anymore, but that is the last state I know of. Similarly to Linux, filesystem drivers are loaded as "modules" into the kernel. It's also still possible to load a filesystem as a file from disk, but that does not work for the filesystem the kernel boots from. The AmigaOS kernel still decides which of the kernel filesystems to use according to the DOSType of the partition. Thanks, -- Martin
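For context, each RDB partition entry carries a 32-bit DOSType: typically three ASCII characters plus a revision byte, and the OS picks the filesystem driver by looking it up. A rough illustration of that lookup idea; the table below is a simplified assumption for a few well-known DOSTypes, not the actual AmigaOS dispatch code:

```python
# DOSType: four bytes, usually three ASCII chars plus a revision byte.
# A few commonly cited values (assumed here for illustration):
DOSTYPES = {
    b"DOS\x00": "OFS (Old File System)",
    b"DOS\x01": "FFS (Fast File System)",
    b"DOS\x03": "FFS International",
    b"SFS\x00": "Smart File System",
}

def filesystem_for(dostype: bytes) -> str:
    """Pick a filesystem driver name for a partition's DOSType."""
    return DOSTYPES.get(dostype, "unknown (load driver from disk or refuse)")
```

Linux's RDB parser does something analogous in spirit: the DOSType tells it (and the mounting user) what is on the partition, while the driver itself comes from the kernel, never from the RDB blocks, matching the malware concern quoted above.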
Spectre V2: Eight new security holes in Intel processors
Hello. It seems there are eight new security holes alongside the Spectre/ Meltdown CPU design issues: https://www.heise.de/security/meldung/Spectre-NG-Intel-Prozessoren-von-neuen-hochriskanten-Sicherheitsluecken-betroffen-4039302.html (German language only; so far I only found German-language reports referring to the Heise c´t article, and no other publicly viewable source on this) Short summary: - eight new security issues found by various research teams (including Google Project Zero) - GPZ may release one of them on 7 May after a 90-day embargo - Intel considers four of them to be critical - Article authors and editors at Heise consider one to be highly critical. They claim it makes it very easy to circumvent boundaries between different virtual machines or between a virtual machine and the hypervisor system. I got the impression that the article lacks a lot of details, however. They even mention that they are not sharing them yet, in the hope that patches will be available before the issues are disclosed in full. I did not see any patches regarding these new issues on LKML, but they may run under different names. Has the Linux kernel community been informed at all? Well, hopefully at least the kernel developers working at Intel are working on patches. Thanks, -- Martin
Re: moving affs + RDB partition support to staging? (was: Re: Moving unmaintained filesystems to staging)
Geert Uytterhoeven - 26.04.18, 13:08: > On Thu, Apr 26, 2018 at 12:28 PM, Martin Steigerwald > > <mar...@lichtvoll.de> wrote: > > You probably put your stick into a cave with ancient sleeping > > dragons > > > > Added in linux-m68k mailing list, as they likely have an opinion on > > how to treat affs + RDB partition support. Also added in Jens Axboe > > about patching that RDB support broken with 2 TB or larger > > harddisks issue which had been in Linux kernel for 6 years while a > > patch exists that to my testing back then solves the issue. […] > If there are bugs in the RDB parser that people run into, they should > be fixed. > If there are limitations in the RDB format on large disks, that's > still not a reason to move it to staging (hi msdos partitioning!). What I ran into was *not* a limitation in the RDB format, but a bug in the Linux implementation of it. After Joanne Dow´s change the 2 TB disk was detected and handled properly by Linux. Also AmigaOS 4.x handles those disks just fine, and I think AmigaOS 3.1/3.5/3.9 supported them too, but I am not sure on the details; it has been a long time since I last booted one of my Amiga systems. Many classic Amiga users may not deal with such large disks, but AmigaOS 4 users probably still do, and some of them may run Linux on their PowerPC motherboards as well. So I think the issue is worth fixing, and I am looking into submitting the fix upstream, which looks pretty straightforward to me, unless someone beats me to it. Thanks, -- Martin
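The failure mode behind the bug report is ordinary 32-bit sector arithmetic: RDB describes partitions in cylinders, and a partition end computed as cylinders × heads × sectors wraps once it exceeds 2^32 512-byte sectors, which is exactly 2 TiB, so the end lands back near the start of the disk. A toy model of that wrap under an assumed geometry (the numbers below are invented for illustration, not taken from the actual parser or my old disk):

```python
SECTOR = 512
U32 = 1 << 32  # 32-bit sector counts wrap here: 2**32 * 512 B = 2 TiB

def partition_end_sector_32bit(cyl_end, heads, secs_per_track):
    """Model the buggy math: the sector product is truncated to 32 bits."""
    return (cyl_end * heads * secs_per_track) % U32

# Hypothetical geometry placing the partition end just past 2 TiB:
heads, spt = 16, 255            # 16 * 255 = 4080 sectors per cylinder
cyl_end = 1_052_938             # end cylinder of a ~2.05 TiB partition

true_end = cyl_end * heads * spt                     # correct 64-bit count
buggy_end = partition_end_sector_32bit(cyl_end, heads, spt)

assert true_end * SECTOR > 2 * 1024**4   # really lies beyond the 2 TiB mark
assert buggy_end < true_end              # ...but wrapped back toward sector 0
```

With the wrapped value, writes meant for the tail of the partition would land in the first megabytes of the disk, which matches the data-destroying behaviour described in bug 43511.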
Re: Moving unmaintained filesystems to staging
Pavel Machek - 26.04.18, 08:11: > On Wed 2018-04-25 08:46:02, Matthew Wilcox wrote: > > Recently ncpfs got moved to staging. Also recently, we had some > > fuzzer developers report bugs in hfs, which they deem a security > > hole because Ubuntu attempts to automount an inserted USB device as > > hfs. > > We promise "no-regressions" for code in main repository, no such > promise for staging. We have quite a lot of code without maintainer. > > Moving code to staging means it will get broken -- staging was not > designed for this. I believe moving anything there is bad idea. > > Staging is for ugly code, not for code that needs new maintainter. Good point. Moving things in and out of some "unmaintained" directory… I am not sure about that either. I tend to think that moving code around does not solve the underlying issue. Which, according to what I got from Matthew, was that distributors enable just about every filesystem they can, which led to hfs being used for automounting a USB stick formatted with it. In the end, what may be beneficial would be hinting distributors and people who compile their own kernels at which features are considered stable, secure and well tested and which are not. But how to determine that? The hint could be some kernel option flag that would be displayed by make *config. But then it probably also needs a justification, or at least a link to more information. Actually, did anyone ever audit the whole kernel source? Thanks, -- Martin
moving affs + RDB partition support to staging? (was: Re: Moving unmaintained filesystems to staging)
Hi Matthew. You probably put your stick into a cave with ancient sleeping dragons :) Added the linux-m68k mailing list, as they likely have an opinion on how to treat affs + RDB partition support. Also added Jens Axboe regarding the issue that RDB support is broken for 2 TB or larger harddisks – an issue which has been in the Linux kernel for 6 years, while a patch exists that, to my testing back then, solves it. Matthew Wilcox - 26.04.18, 04:57: > On Wed, Apr 25, 2018 at 10:30:29PM +0200, David Sterba wrote: > > I had similar toughts some time ago while browsing the fs/ > > directory. > > Access to the filesystem images can be reimplemented in FUSE, but > > other than that, I don't think the in-kernel code would be missed. > > > > It's hard to know how many users are there. I was curious eg. about > > bfs, befs, coda or feevxfs, looked at the last commits and searched > > around web if there are any mentions or user community. But as long > > as there's somebody listed in MAINTAINERS, the above are not > > candidates for moving to staging or deletion. > > Yeah, it's pretty sad how few commits some of these filesystems have > had in recent years. One can argue that they're stable and don't need > to be fixed because they aren't broken, but I find it hard to believe > that any of them were better-implemented than ext2 which still sees > regular bugfixes. Regarding affs there is a severe issue which is not in affs itself but in the handling of Rigid Disk Block (RDB) partitions, the Amiga partitioning standard, which is far more advanced than MBR: the parser overruns for 2 TB or larger drives and then wraps over to the beginning of the drive – I bet you can imagine what happens if you write to an area beyond the 2 TB mark. I learned this with an external 2 TB RDB-partitioned harddisk back then, which I used for backing up a Sam440ep (a kind of successor to old, classic Amiga hardware) plus some Linux-related stuff in another partition. 
Joanne Dow, the developer of hdwrench.library which HDToolBox uses for partitioning in AmigaOS 3.5/3.9, provided a patch back then, but never officially put it through upstreaming, as I had offered to write a good description and upstream it through Jens Axboe. I may take this as a reason to… actually follow through this time, hopefully remembering all the details in order to provide a meaningful patch description – but I think mostly I can do just careful copy and paste. Even though I believe Joanne Dow´s fix only fixed my bug report 43511, but not 43521, which is more about a safeguarding issue in case of future overflows, I still think it would be good to go in, in case affs + RDB stay in their current places. However, in case you move affs to staging, I may be less motivated to do so, but then I suggest you also move RDB partitioning support to staging, because it is the one that is known to be dangerously broken for 2 TB or larger disks. And yeah, I admit I did not follow through with having that patch upstreamed. Probably I did not want to be responsible in case my description was not absolutely accurate or the patch broke something else. I do not have that 2 TB drive anymore and don´t feel like setting one up in a suitable way in order to work on this patch, but my testing back then was quite elaborate and I still feel pretty confident about it. I totally get your motivation, but I would find it somewhat sad to see the filesystems you mentioned go into staging. However, as I just showed, for the user it may be better, because there may be unfixed dangerous bugs. FUSE might be an interesting approach, but I bet it will not solve the maintenance issue. If there is no one maintaining it in the kernel, I think it's unlikely that someone will be found to adapt it into a FUSE filesystem and maintain it. And then I am not aware of FUSE-based partitioning support. 
(And I think ideally we´d have a microkernel and run all filesystems in userspace processes with a defined set of privileges, but that is simply not Linux as it is.) Partitions: Amiga RDB partition on 2 TB disk way too big, while OK in AmigaOS 4.1 https://lkml.org/lkml/2012/6/17/6 Bug 43521 - Amiga RDB partitions: truncates miscalculated partition size instead of refusing to use it https://bugzilla.kernel.org/show_bug.cgi?id=43521 Bug 43511 - Partitions: Amiga RDB partition on 2 TB disk way too big, while OK in AmigaOS 4.1 https://bugzilla.kernel.org/show_bug.cgi?id=43511 I forward the relevant mail of Joanne; in https://bugzilla.kernel.org/show_bug.cgi?id=43511#c7 I even have the patch in diff format. And I just checked: the issue is still unpatched as of 4.16.3. -- Forwarded Message -- Subject: Re: Partitions: Amiga RDB partition on 2 TB disk way too big, while OK in AmigaOS 4.1 Date: Monday, 18 June 2012, 03:28:48 CEST From: jdow <j...@earth
Re: [PATCH] blk-mq: start request gstate with gen 1
Hi Jianchao. jianchao.wang - 17.04.18, 16:34: > On 04/17/2018 08:10 PM, Martin Steigerwald wrote: > > For testing it I add it to 4.16.2 with the patches I have already? > > You could try to only apply this patch to have a test. :) I tested 4.16.3 with just your patch (+ the unrelated btrfs trimming fix I have carried for a long time already) and it did at least 15 boots successfully (without hanging). So far also no "error loading smart data" mail, but it takes a few days of suspend/hibernation + resume cycles to know for sure. Thanks, -- Martin
Re: [PATCH] blk-mq: start request gstate with gen 1
Hi Jianchao. jianchao.wang - 17.04.18, 16:34: > On 04/17/2018 08:10 PM, Martin Steigerwald wrote: > > For testing it I add it to 4.16.2 with the patches I have already? > > You could try to only apply this patch to have a test. Compiling now to have a test. Thanks, -- Martin
Re: [PATCH] blk-mq: start request gstate with gen 1
Hi Jianchao, Jianchao Wang - 17.04.18, 05:46: > rq->gstate and rq->aborted_gstate both are zero before rqs are > allocated. If we have a small timeout, when the timer fires, > there could be rqs that are never allocated, and also there could > be rq that has been allocated but not initialized and started. At > the moment, the rq->gstate and rq->aborted_gstate both are 0, thus > the blk_mq_terminate_expired will identify the rq is timed out and > invoke .timeout early. For testing, do I add it to 4.16.2 with the patches I already have? - '[PATCH] blk-mq_Directly schedule q->timeout_work when aborting a request.mbox' - '[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}().mbox' - '[PATCH V4 1_2] blk-mq_set RQF_MQ_TIMEOUT_EXPIRED when the rq'\''s timeout isn'\''t handled.mbox' - '[PATCH V4 2_2] blk-mq_fix race between complete and BLK_EH_RESET_TIMER.mbox' > For scsi, this will cause scsi_times_out to be invoked before the > scsi_cmnd is not initialized, scsi_cmnd->device is still NULL at > the moment, then we will get crash. 
> > Cc: Bart Van Assche <bart.vanass...@wdc.com> > Cc: Tejun Heo <t...@kernel.org> > Cc: Ming Lei <ming@redhat.com> > Cc: Martin Steigerwald <mar...@lichtvoll.de> > Cc: sta...@vger.kernel.org > Signed-off-by: Jianchao Wang <jianchao.w.w...@oracle.com> > --- > block/blk-core.c | 4 ++++ > block/blk-mq.c | 7 +++++++ > 2 files changed, 11 insertions(+) > > diff --git a/block/blk-core.c b/block/blk-core.c > index abcb868..ce62681 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct > request *rq) rq->part = NULL; > seqcount_init(&rq->gstate_seq); > u64_stats_init(&rq->aborted_gstate_sync); > + /* > + * See comment of blk_mq_init_request > + */ > + WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC); > } > EXPORT_SYMBOL(blk_rq_init); > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index f5c7dbc..d62030a 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct > blk_mq_tag_set *set, struct request *rq, > > seqcount_init(&rq->gstate_seq); > u64_stats_init(&rq->aborted_gstate_sync); > + /* > + * start gstate with gen 1 instead of 0, otherwise it will be equal > + * to aborted_gstate, and be identified timed out by > + * blk_mq_terminate_expired. > + */ > + WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC); > + > return 0; > } -- Martin
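The race the patch closes can be modelled abstractly: a request is deemed expired when its generation matches the captured aborted generation, and both start at 0 for requests that were allocated but never started. Bumping the initial generation to 1 makes that comparison fail for them. A toy model of the idea; this is not the kernel's actual encoding (which packs the generation into the upper bits of gstate and uses seqcounts), just the comparison logic:

```python
MQ_RQ_GEN_INC = 1  # toy value: one generation step per request lifecycle

class Request:
    """Minimal stand-in for struct request's timeout-generation fields."""
    def __init__(self, start_gen):
        self.gstate = start_gen      # generation advanced as the rq progresses
        self.aborted_gstate = 0      # generation snapshot taken at timeout time

    def looks_expired(self):
        # blk_mq_terminate_expired-style check, radically simplified
        return self.gstate == self.aborted_gstate

# Old behaviour: a freshly allocated, never-started request matches the
# zero aborted_gstate and is falsely treated as timed out.
assert Request(start_gen=0).looks_expired()

# Patched behaviour: starting at gen 1 keeps it from being misidentified.
assert not Request(start_gen=MQ_RQ_GEN_INC).looks_expired()
```

That is why the crash showed up in scsi_times_out: the false "expired" verdict invoked .timeout on a request whose scsi_cmnd->device was still NULL.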
Re: [PATCH] blk-mq: start request gstate with gen 1
Hi Jianchao,

Jianchao Wang - 17.04.18, 05:46:
> rq->gstate and rq->aborted_gstate are both zero before rqs are
> allocated. If we have a small timeout, when the timer fires,
> there could be rqs that are never allocated, and also there could
> be rqs that have been allocated but not initialized and started. At
> that moment, rq->gstate and rq->aborted_gstate are both 0, thus
> blk_mq_terminate_expired will identify the rq as timed out and
> invoke .timeout early.

For testing, shall I add it to 4.16.2 on top of the patches I already have?

- '[PATCH] blk-mq_Directly schedule q->timeout_work when aborting a request.mbox'
- '[PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}().mbox'
- '[PATCH V4 1_2] blk-mq_set RQF_MQ_TIMEOUT_EXPIRED when the rq's timeout isn't handled.mbox'
- '[PATCH V4 2_2] blk-mq_fix race between complete and BLK_EH_RESET_TIMER.mbox'

> For scsi, this will cause scsi_times_out to be invoked before the
> scsi_cmnd is initialized; scsi_cmnd->device is still NULL at
> that moment, then we will get a crash.
>
> Cc: Bart Van Assche <bart.vanass...@wdc.com>
> Cc: Tejun Heo <t...@kernel.org>
> Cc: Ming Lei <ming@redhat.com>
> Cc: Martin Steigerwald <mar...@lichtvoll.de>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Jianchao Wang <jianchao.w.w...@oracle.com>
> ---
>  block/blk-core.c | 4 ++++
>  block/blk-mq.c   | 7 +++++++
>  2 files changed, 11 insertions(+)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index abcb868..ce62681 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -201,6 +201,10 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
>  	rq->part = NULL;
>  	seqcount_init(&rq->gstate_seq);
>  	u64_stats_init(&rq->aborted_gstate_sync);
> +	/*
> +	 * See comment of blk_mq_init_request
> +	 */
> +	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
>  }
>  EXPORT_SYMBOL(blk_rq_init);
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f5c7dbc..d62030a 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2069,6 +2069,13 @@ static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
>
>  	seqcount_init(&rq->gstate_seq);
>  	u64_stats_init(&rq->aborted_gstate_sync);
> +	/*
> +	 * Start gstate with gen 1 instead of 0, otherwise it will be equal
> +	 * to aborted_gstate, and be identified as timed out by
> +	 * blk_mq_terminate_expired.
> +	 */
> +	WRITE_ONCE(rq->gstate, MQ_RQ_GEN_INC);
> +
>  	return 0;
> }

-- 
Martin
Re: [PATCH] blk-mq: Directly schedule q->timeout_work when aborting a request
Martin Steigerwald - 10.04.18, 20:43:
> Tejun Heo - 03.04.18, 00:04:
> > Request abortion is performed by overriding the deadline to now and
> > scheduling timeout handling immediately. For the latter part, the
> > code was using mod_timer(timeout, 0), which can't guarantee that the
> > timer runs afterwards. Let's schedule the underlying work item
> > directly instead.
> >
> > This fixes the hangs during probing reported by Sitsofe, but it isn't
> > yet clear to me how the failure can happen reliably if it's just the
> > above described race condition.
>
> Compiling a 4.16.1 kernel with that patch to test whether this fixes
> the boot hang I reported in:
>
> [Possible REGRESSION, 4.16-rc4] Error updating SMART data during
> runtime and boot failures with blk_mq_terminate_expired in backtrace
> https://bugzilla.kernel.org/show_bug.cgi?id=199077

Fails as well, see https://bugzilla.kernel.org/show_bug.cgi?id=199077#c8 for a photo with (part of) the backtrace.

> The "Error updating SMART data during runtime" thing I reported there
> as well may still be another (independent) issue.
>
> > Signed-off-by: Tejun Heo <t...@kernel.org>
> > Reported-by: Sitsofe Wheeler <sits...@gmail.com>
> > Reported-by: Meelis Roos <mr...@linux.ee>
> > Fixes: 358f70da49d7 ("blk-mq: make blk_abort_request() trigger timeout path")
> > Cc: sta...@vger.kernel.org # v4.16
> > Link: http://lkml.kernel.org/r/CALjAwxh-PVYFnYFCJpGOja+m5SzZ8Sa4J7ohxdK=r8NyOF-EMa...@mail.gmail.com
> > Link: http://lkml.kernel.org/r/alpine.lrh.2.21.1802261049140.4...@math.ut.ee
> > ---
> > Hello,
> >
> > I don't have the full explanation yet but here's a preliminary patch.
> >
> > Thanks.
> >
> >  block/blk-timeout.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block/blk-timeout.c b/block/blk-timeout.c
> > index a05e367..f0e6e41 100644
> > --- a/block/blk-timeout.c
> > +++ b/block/blk-timeout.c
> > @@ -165,7 +165,7 @@ void blk_abort_request(struct request *req)
> >  		 * No need for fancy synchronizations.
> >  		 */
> >  		blk_rq_set_deadline(req, jiffies);
> > -		mod_timer(&req->q->timeout, 0);
> > +		kblockd_schedule_work(&req->q->timeout_work);
> >  	} else {
> >  		if (blk_mark_rq_complete(req))
> >  			return;

-- 
Martin
Re: [PATCH] blk-mq: Directly schedule q->timeout_work when aborting a request
Tejun Heo - 03.04.18, 00:04:
> Request abortion is performed by overriding the deadline to now and
> scheduling timeout handling immediately. For the latter part, the
> code was using mod_timer(timeout, 0), which can't guarantee that the
> timer runs afterwards. Let's schedule the underlying work item
> directly instead.
>
> This fixes the hangs during probing reported by Sitsofe, but it isn't
> yet clear to me how the failure can happen reliably if it's just the
> above described race condition.

Compiling a 4.16.1 kernel with that patch to test whether this fixes
the boot hang I reported in:

[Possible REGRESSION, 4.16-rc4] Error updating SMART data during
runtime and boot failures with blk_mq_terminate_expired in backtrace
https://bugzilla.kernel.org/show_bug.cgi?id=199077

The "Error updating SMART data during runtime" thing I reported there
as well may still be another (independent) issue.

> Signed-off-by: Tejun Heo <t...@kernel.org>
> Reported-by: Sitsofe Wheeler <sits...@gmail.com>
> Reported-by: Meelis Roos <mr...@linux.ee>
> Fixes: 358f70da49d7 ("blk-mq: make blk_abort_request() trigger timeout path")
> Cc: sta...@vger.kernel.org # v4.16
> Link: http://lkml.kernel.org/r/CALjAwxh-PVYFnYFCJpGOja+m5SzZ8Sa4J7ohxdK=r8NyOF-EMa...@mail.gmail.com
> Link: http://lkml.kernel.org/r/alpine.lrh.2.21.1802261049140.4...@math.ut.ee
> ---
> Hello,
>
> I don't have the full explanation yet but here's a preliminary patch.
>
> Thanks.
>
>  block/blk-timeout.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/blk-timeout.c b/block/blk-timeout.c
> index a05e367..f0e6e41 100644
> --- a/block/blk-timeout.c
> +++ b/block/blk-timeout.c
> @@ -165,7 +165,7 @@ void blk_abort_request(struct request *req)
>  		 * No need for fancy synchronizations.
>  		 */
>  		blk_rq_set_deadline(req, jiffies);
> -		mod_timer(&req->q->timeout, 0);
> +		kblockd_schedule_work(&req->q->timeout_work);
>  	} else {
>  		if (blk_mark_rq_complete(req))
>  			return;

-- 
Martin
Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts
Hans de Goede - 19.03.18, 10:50:
> > Martin (or someone else): Could you give a status update? I have this
> > issue on my list of regressions, but it's hard to follow as two
> > different issues seem to be discussed. Or is it just one issue? Did the
> > patch/discussion that Bart pointed to help? Is the issue still showing
> > up in rc6?
>
> You're right, there are 2 issues here:
[…]
> 2) There seem to be some latency issues in the MU03 version of the
> firmware, triggered by polling SMART data, which causes lvmetad to
> timeout in some cases. Note I'm not involved in that part of this
> thread, but I believe that issue is currently unresolved.

The second issue consists of what Hans described plus an occasional hang on boot or on resume from hibernation to disk.

The second issue is still unfixed as of 4.16 + [PATCH v2] block: Change a rcu_read_{lock,unlock}_sched() pair into rcu_read_{lock,unlock}() by Bart Van Assche, which Jens Axboe accepted¹.

[1] https://patchwork.kernel.org/patch/10294287/

Currently compiling 4.16.1, but I do not expect a change, as there is nothing about the blk-mq subsystem in the changelog as far as I saw.

Will update

[Bug 199077] [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and boot failures with blk_mq_terminate_expired in backtrace
https://bugzilla.kernel.org/show_bug.cgi?id=199077

as well about the current state. The bug report contains a screenshot of one of the boot hangs. I had two more on Monday, but did not take the chance to make another photo. I will do so next time if it is convenient enough, and compare whether it reveals anything more than my first photo.

Thanks,
-- 
Martin
Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts
Hi Thorsten,

Hans de Goede - 19.03.18, 10:50:
> On 19-03-18 10:42, Thorsten Leemhuis wrote:
> > Hi! On 11.03.2018 09:20, Martin Steigerwald wrote:
> >> Since 4.16-rc4 (upgraded from 4.15.2 which worked) I have an issue
> >> with SMART checks occasionally failing like this:
> >
> > Martin (or someone else): Could you give a status update? I have this
> > issue on my list of regressions, but it's hard to follow as two
> > different issues seem to be discussed. Or is it just one issue? Did the

There are at least two issues.

> > patch/discussion that Bart pointed to help? Is the issue still showing
> > up in rc6?
>
> You're right, there are 2 issues here:
>
> 1) The Crucial M500 SSD (at least the 480GB MU03 firmware version) does
> not like enabling SATA link power-management at a level of min_power
> or at the new(ish) med_power_with_dipm level. This problem exists in
> older kernels too, so this is not really a regression.
>
> New in 4.16 is a Kconfig option to enable SATA LPM by default, which
> makes this existing problem much more noticeable. Not sure if you want
> to count this as a regression. Either way I'm preparing and sending
> out a patch fixing this (by blacklisting LPM for this model SSD) right
> now.

Yes, and this is fixed by the nolpm quirk patch of Hans.

> 2) There seem to be some latency issues in the MU03 version of the
> firmware, triggered by polling SMART data, which causes lvmetad to
> timeout in some cases. Note I'm not involved in that part of this
> thread, but I believe that issue is currently unresolved.

Additionally I get an occasional failure on boot / resume from hibernation in blk_mq_terminate_expired. But I tend to believe that this is the same issue.

This is still unresolved as of 4.16-rc6 + the nolpm quirk patch from Hans and the "Change synchronize_rcu() in scsi_device_quiesce() into synchronize_sched()" patch by Bart, because I already had this occasional boot failure with those applied.

The patch by Bart seems to be related to another issue of the blk-mq quiescing stuff.

Thanks,
-- 
Martin
Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts
Hi Hans,

Hans de Goede - 18.03.18, 22:34:
> On 14-03-18 13:48, Martin Steigerwald wrote:
> > Hans de Goede - 14.03.18, 12:05:
> >> Hi,
> >>
> >> On 14-03-18 12:01, Martin Steigerwald wrote:
> >>> Hans de Goede - 11.03.18, 15:37:
> >>>> Hi Martin,
> >>>>
> >>>> On 11-03-18 09:20, Martin Steigerwald wrote:
> >>>>> Hello.
> >>>>>
> >>>>> Since 4.16-rc4 (upgraded from 4.15.2 which worked) I have an issue
> >>>>> with SMART checks occasionally failing like this:
> >>>>>
> >>>>> smartd[28017]: Device: /dev/sdb [SAT], is in SLEEP mode, suspending checks
> >>>>> udisksd[24408]: Error performing housekeeping for drive
> >>>>> /org/freedesktop/UDisks2/drives/INTEL_SSDSA2CW300G3_[…]: Error updating
> >>>>> SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> >>>>> data returned:#012: 0e 09 0c 00 00 00 ff 00 00 00 00 00 00 00 50 00
> >>>>> #012 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>>> #012 (g-io-error-quark, 0)
> >>>>> udisksd[24408]: Error performing housekeeping for drive
> >>>>> /org/freedesktop/UDisks2/drives/Crucial_CT480M500SSD3_[…]: Error updating
> >>>>> SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> >>>>> data returned:#012: 01 00 1d 00 00 00 0e 09 0c 00 00 00 ff 00 00 00
> >>>>> #012 0010: 00 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00
> >>>>> #012 (g-io-error-quark, 0)
> >>>>>
> >>>>> (Intel SSD is connected via SATA, Crucial via mSATA in a ThinkPad T520)
> >>>>>
> >>>>> However when I then check manually with smartctl -a | -x | -H the device
> >>>>> reports SMART data just fine.
> >>>>>
> >>>>> As smartd correctly detects that the device is in sleep mode, this may be
> >>>>> a userspace issue in udisksd.
> >>>>>
> >>>>> Also at some boot attempts the boot hangs with a message like "could not
> >>>>> connect to lvmetad, scanning manually for devices". I use BTRFS RAID 1
> >>>>> on two LVs (each on one of the SSDs). A configuration that requires a
> >>>>> manual adaption to the InitRAMFS in order to boot (basically vgchange -ay
> >>>>> before btrfs device scan).
> >>>>>
> >>>>> I wonder whether that has to do with the new SATA LPM policy stuff, but
> >>>>> as I had issues with
> >>>>>
> >>>>> 3 => Medium power with Device Initiated PM enabled
> >>>>>
> >>>>> (machine did not boot, which could also have been caused by me
> >>>>> accidentally removing all TCP/IP network support in the kernel with
> >>>>> that setting)
> >>>>>
> >>>>> I set it back to
> >>>>>
> >>>>> CONFIG_SATA_MOBILE_LPM_POLICY=0
> >>>>>
> >>>>> (firmware settings)
> >>>>
> >>>> Right, so at that setting the LPM policy changes are effectively
> >>>> disabled and cannot explain your SMART issues.
> >>>>
> >>>> Still I would like to zoom in on this part of your bug report, because
> >>>> for Fedora 28 we are planning to ship with CONFIG_SATA_MOBILE_LPM_POLICY=3
> >>>> and AFAIK Ubuntu has similar plans.
> >>>>
> >>>> I suspect that the issues you were seeing with
> >>>> CONFIG_SATA_MOBILE_LPM_POLICY=3 were with the Crucial disk? I've attached
> >>>> a patch for you to test, which disables LPM for your model Crucial SSD
> >>>> (but keeps it on for the Intel disk). If you can confirm that with that
> >>>> patch you can run with CONFIG_SATA_MOBILE_LPM_POLICY=3 without issues,
> >>>> that would be great.
> >>>
> >>> With 4.16-rc5 with CONFIG_SATA_MOBILE_LPM_POLICY=3 the system successfully
> >>> booted three times in a row. So feel free to add tested-by.
> >>
> >> Thanks.
> >>
> >> To be clear, you're talking about 4.16-rc5 with the patch I made to
> >> blacklist the Crucial disk I assume, not just plain 4.16-rc5, right?
> >
> > 4.16-rc5 with your
> >
> > 0001-libata-Apply-NOLPM-quirk-to-Crucial-M500-480GB-SSDs.patch
>
> I was about to submit this upstream and was planning on extending it to
> also cover the 960GB version, which led to me doing a quick google.
> Judging from the google results it seems that there are multiple firmware
> versions of this SSD out there and I wonder if you are perhaps running
> an older version of the firmware. If you do:
>
> dmesg | grep Crucial_CT480M500
>
> You should see something like this:
>
> ata2.00: ATA-9: Crucial_CT480M500SSD3, MU03, max UDMA/133
>
> I'm interested in the "MU03" part, what is that in your case?

Although I never updated the firmware, I do have MU03:

% lsscsi | grep Crucial
[2:0:0:0]  disk  ATA  Crucial_CT480M50  MU03  /dev/sdb

% dmesg | grep Crucial_CT480M500
[    2.424537] ata3.00: ATA-9: Crucial_CT480M500SSD3, MU03, max UDMA/133

> Note I'm not saying we should not do the NOLPM quirk, but maybe we
> can limit it to older firmware.

Thanks,
-- 
Martin
Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts
Martin Steigerwald - 14.03.18, 12:01:
> Hans de Goede - 11.03.18, 15:37:
> > Hi Martin,
> >
> > On 11-03-18 09:20, Martin Steigerwald wrote:
> > > Hello.
> > >
> > > Since 4.16-rc4 (upgraded from 4.15.2 which worked) I have an issue
> > > with SMART checks occasionally failing like this:
> > >
> > > smartd[28017]: Device: /dev/sdb [SAT], is in SLEEP mode, suspending checks
> > > udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/INTEL_SSDSA2CW300G3_[…]: Error updating
> > > SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#012: 0e 09 0c 00 00 00 ff 00 00 00 00 00 00 00 50 00
> > > #012 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > #012 (g-io-error-quark, 0)
> > > udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/Crucial_CT480M500SSD3_[…]: Error updating
> > > SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#012: 01 00 1d 00 00 00 0e 09 0c 00 00 00 ff 00 00 00
> > > #012 0010: 00 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00
> > > #012 (g-io-error-quark, 0)
> > >
> > > (Intel SSD is connected via SATA, Crucial via mSATA in a ThinkPad T520)
> > >
> > > However when I then check manually with smartctl -a | -x | -H the device
> > > reports SMART data just fine.
> > >
> > > As smartd correctly detects that the device is in sleep mode, this may be
> > > a userspace issue in udisksd.
> > >
> > > Also at some boot attempts the boot hangs with a message like "could not
> > > connect to lvmetad, scanning manually for devices". I use BTRFS RAID 1
> > > on two LVs (each on one of the SSDs). A configuration that requires a
> > > manual adaption to the InitRAMFS in order to boot (basically vgchange -ay
> > > before btrfs device scan).
> > >
> > > I wonder whether that has to do with the new SATA LPM policy stuff, but
> > > as I had issues with
> > >
> > > 3 => Medium power with Device Initiated PM enabled
> > >
> > > (machine did not boot, which could also have been caused by me
> > > accidentally removing all TCP/IP network support in the kernel with
> > > that setting)
> > >
> > > I set it back to
> > >
> > > CONFIG_SATA_MOBILE_LPM_POLICY=0
> > >
> > > (firmware settings)
> >
> > Right, so at that setting the LPM policy changes are effectively
> > disabled and cannot explain your SMART issues.
> >
> > Still I would like to zoom in on this part of your bug report, because
> > for Fedora 28 we are planning to ship with CONFIG_SATA_MOBILE_LPM_POLICY=3
> > and AFAIK Ubuntu has similar plans.
> >
> > I suspect that the issues you were seeing with
> > CONFIG_SATA_MOBILE_LPM_POLICY=3 were with the Crucial disk? I've attached
> > a patch for you to test, which disables LPM for your model Crucial SSD
> > (but keeps it on for the Intel disk). If you can confirm that with that
> > patch you can run with CONFIG_SATA_MOBILE_LPM_POLICY=3 without issues,
> > that would be great.
>
> With 4.16-rc5 with CONFIG_SATA_MOBILE_LPM_POLICY=3 the system successfully
> booted three times in a row. So feel free to add tested-by.
>
> Let's see whether the blk_mq_terminate_expired or the smartd/udisks error
> messages reappear with rc5. I still think they are a different issue.

As expected, these two other issues still happen with 4.16-rc5.

Thanks,
-- 
Martin
Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts
Martin Steigerwald - 14.03.18, 12:01:
> Hans de Goede - 11.03.18, 15:37:
> > Hi Martin,
> >
> > On 11-03-18 09:20, Martin Steigerwald wrote:
> > > Hello.
> > >
> > > Since 4.16-rc4 (upgraded from 4.15.2, which worked) I have an issue
> > > with SMART checks occasionally failing like this:
> > >
> > > smartd[28017]: Device: /dev/sdb [SAT], is in SLEEP mode, suspending checks
> > > udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/INTEL_SSDSA2CW300G3_[…]: Error updating
> > > SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#012: 0e 09 0c 00 00 00 ff 00 00 00 00 00 00 00 50
> > > 00..P.#0120010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00#012
> > > (g-io-error-quark, 0)
> > > merkaba udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/Crucial_CT480M500SSD3_[…]: Error updating
> > > SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#012: 01 00 1d 00 00 00 0e 09 0c 00 00 00 ff 00 00
> > > 00#0120010: 00 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00P...#012
> > > (g-io-error-quark, 0)
> > >
> > > (The Intel SSD is connected via SATA, the Crucial via mSATA in a
> > > ThinkPad T520.)
> > >
> > > However, when I then check manually with smartctl -a | -x | -H, the
> > > device reports SMART data just fine.
> > >
> > > As smartd correctly detects that the device is in sleep mode, this may
> > > be a userspace issue in udisksd.
> > >
> > > Also, at some boot attempts the boot hangs with a message like "could
> > > not connect to lvmetad, scanning manually for devices". I use BTRFS
> > > RAID 1 on two LVs (each on one of the SSDs), a configuration that
> > > requires a manual adaptation to the initramfs in order to boot
> > > (basically vgchange -ay before btrfs device scan).
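[The manual initramfs adaptation mentioned above (activating LVM before btrfs scans for its devices) could be sketched roughly as follows; the hook file name and its eventual location under e.g. /etc/initramfs-tools/scripts/local-top/ are illustrative assumptions, not taken from the actual setup:]

```shell
# Sketch of a hypothetical initramfs hook: activate all LVM logical volumes
# first, then let btrfs register the members of multi-device filesystems.
# The file name "lvm-before-btrfs" is made up for illustration.
cat > lvm-before-btrfs <<'EOF'
#!/bin/sh
# Bring up every volume group so both LVs of the BTRFS RAID 1 exist.
vgchange -ay
# Make btrfs aware of all devices belonging to multi-device filesystems.
btrfs device scan
EOF
chmod +x lvm-before-btrfs
```

[The ordering matters: without the prior vgchange -ay, btrfs device scan cannot see the second RAID 1 member and mounting the root filesystem degrades or fails.]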
> > > I wonder whether that has to do with the new SATA LPM policy stuff,
> > > but as I had issues with
> > >
> > > 3 => Medium power with Device Initiated PM enabled
> > >
> > > (the machine did not boot, which could also have been caused by me
> > > accidentally removing all TCP/IP network support in the kernel with
> > > that setting) I set it back to
> > >
> > > CONFIG_SATA_MOBILE_LPM_POLICY=0
> > >
> > > (firmware settings)
> >
> > Right, so with that setting the LPM policy changes are effectively
> > disabled and cannot explain your SMART issues.
> >
> > Still, I would like to zoom in on this part of your bug report, because
> > for Fedora 28 we are planning to ship with
> > CONFIG_SATA_MOBILE_LPM_POLICY=3, and AFAIK Ubuntu has similar plans.
> >
> > I suspect that the issue you were seeing with
> > CONFIG_SATA_MOBILE_LPM_POLICY=3 was with the Crucial disk? I've attached
> > a patch for you to test, which disables LPM for your model of Crucial
> > SSD (but keeps it on for the Intel disk). If you can confirm that with
> > that patch you can run with CONFIG_SATA_MOBILE_LPM_POLICY=3 without
> > issues, that would be great.
>
> With 4.16-rc5 and CONFIG_SATA_MOBILE_LPM_POLICY=3 the system successfully
> booted three times in a row. So feel free to add Tested-by.
>
> Let's see whether the blk_mq_terminate_expired or the smartd/udisks error
> messages reappear with rc5. I still think they are a different issue.

As expected, these two other issues still happen with 4.16-rc5.

Thanks,
-- Martin
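[For readers following along: CONFIG_SATA_MOBILE_LPM_POLICY only sets the *default* policy, so the setting can be inspected, and experimented with, per host at runtime through sysfs without rebuilding the kernel. A hedged sketch; the sysfs attribute may be absent on some hosts, and the host0 in the commented write is only an example:]

```shell
# Print the current link power management policy of every SCSI/SATA host
# that exposes one; hosts without the attribute are skipped.
for host in /sys/class/scsi_host/host*; do
    [ -f "$host/link_power_management_policy" ] || continue
    printf '%s: %s\n' "${host##*/}" \
        "$(cat "$host/link_power_management_policy")"
done

# The runtime equivalent of policy 3 ("medium power with Device Initiated
# PM") would be, for example (requires root; host0 is illustrative):
#   echo med_power_with_dipm > /sys/class/scsi_host/host0/link_power_management_policy
```

[Temporarily switching a host to med_power_with_dipm this way is a convenient method to reproduce LPM-related problems, such as the suspected Crucial SSD issue, on a kernel whose compile-time default is 0.]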