Re: 9.1: boot-time delay? [WORKAROUND FOUND]
bou...@antioche.eu.org (Manuel Bouyer) writes: >Another issue could be mstohz() called with a delay too short; >mstohz() will round it up to 1 tick. # define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul)) If mstohz() would round up to full ticks, it could actually avoid some pitfalls. But it doesn't.
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On Tue, May 25, 2021 at 04:04:56PM -0400, Mouse wrote: > > I suppose it's not possible to configure ahcisata in the BIOS on the > > long-delay machines? > > Thank you very much! Yes. That is possible - and it fixes the delay. > I would not have thought to look for that; I would not have expected > piixide and ahcisata to be similar enough that a BIOS setting could > personality-swap between them. > > > I'm guessing this is some quirk of the pciide(4) and piixide(4) > > drivers. > > Sounds like; they presumably have a bug somewhere in some delay > calculation. But at least I have something approaching a workaround. The reset and probe procedure is different bewteen ide and ahci. The problem is probably in this area. Actually the root cause may be a delay too short, not too long. AFAIK the code uses mstohz() everywhere to compute tick values, exept maybe a few cases where we want a really short delay and we use 1. This is where a very high HZ may cause a too short delay. Another issue could be mstohz() called with a delay too short; mstohz() will round it up to 1 tick. But I think you will need to instrument the ide probe in dev/ic/wdc.c. It's been a while since I last looked at this, but I think the code you want to look at is wdc_drvprobe(). -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: 9.1: boot-time delay?
On Tue, May 25, 2021 at 10:21:40PM +0200, Christoph Badura wrote: > On Wed, May 26, 2021 at 05:02:52AM +1000, matthew green wrote: > > > > +optionsHZ=8000 > > this can become a problem due to integer division. > > > > any number of ticks less than hz (8000) will be rounded > > down to 0 in a number of places now, where as before it > > was only less than 100. i've seen this trip up in the > > kernel before, and sometimes that '0' means 'poll', and > > sometimes it means 'sleep forever'. > > > a lot of places in the kernel *do* avoid (eg, with adding > > hz-1 and then dividing by hz) but there are a number that > > do not... > > So, should we introduce a CPP macro or an inline function that abstracts > the common code away in a way that avoids such rounding down and > produces correct results? We already have mstohz() and hztoms() -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
>>> I suppose it's not possible to configure ahcisata in the BIOS on >>> the long-delay machines? >> I would not have thought to look for that; I would not have expected >> piixide and ahcisata to be similar enough that a BIOS setting could >> personality-swap between them. > IIRC, they are not really that similar. But the dmesg excerpt you > cited had piixide at *Intel 6 Series Serial ATA Controller*. So, > clearly, the device had to support ahci mode. Well, "clearly" for > those in the know. Indeed! Still, I wouldn't've thought to look for it because I wasn't aware there was *any* hardware that could personality-flip like that. For someone familiar with that hardware, or with other hardware that can do it, it'd be a reasonable thing - I guess those are your "those in the know"! Which didn't include me. At least until now. :-) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: 9.1: boot-time delay?
On Wed, May 26, 2021 at 05:02:52AM +1000, matthew green wrote: > > > +options HZ=8000 > this can become a problem due to integer division. > > any number of ticks less than hz (8000) will be rounded > down to 0 in a number of places now, where as before it > was only less than 100. i've seen this trip up in the > kernel before, and sometimes that '0' means 'poll', and > sometimes it means 'sleep forever'. > a lot of places in the kernel *do* avoid (eg, with adding > hz-1 and then dividing by hz) but there are a number that > do not... So, should we introduce a CPP macro or an inline function that abstracts the common code away in a way that avoids such rounding down and produces correct results? --chris
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On Tue, May 25, 2021 at 04:04:56PM -0400, Mouse wrote: > > I suppose it's not possible to configure ahcisata in the BIOS on the > > long-delay machines? > > Thank you very much! Yes. That is possible - and it fixes the delay. > I would not have thought to look for that; I would not have expected > piixide and ahcisata to be similar enough that a BIOS setting could > personality-swap between them. IIRC, they are not really that similar. But the dmesg excerpt you cited had piixide at *Intel 6 Series Serial ATA Controller*. So, clearly, the device had to support ahci mode. Well, "clearly" for those in the know. --chris
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
> I suppose it's not possible to configure ahcisata in the BIOS on the > long-delay machines? Thank you very much! Yes. That is possible - and it fixes the delay. I would not have thought to look for that; I would not have expected piixide and ahcisata to be similar enough that a BIOS setting could personality-swap between them. > I'm guessing this is some quirk of the pciide(4) and piixide(4) > drivers. Sounds like; they presumably have a bug somewhere in some delay calculation. But at least I have something approaching a workaround. > Not to be too flip, but do you expect these machines to reboot > frequently in production? Often enough that my colleague on the customer-interface side of things thinks twenty seconds of delay on reboot is a problem. > If not, then I'd probably live with the delay on reboot as at this > point, [...] Me too. I didn't even realize it was there until they pointed it out to me; I don't really notice twenty seconds more delay when the typical peecee BIOS imposes anywhere from thirty seconds to five minutes before it even starts to boot. I'm obviously not part of their target market for this thing. > I'd be concerned that any fix I came up with for it would have > implications down the road which might be more serious and more > impactful on operations. Fair concern. > I certainly understand the need to know what's going on, but if a > machine only reboots once or twice a year in production, then ... Also fair. But I think once a day is the very fewest reboots we're likely to see for this, though I'm not really all that in touch with the user base, so I'll push this over to them and see if it's an acceptable workaround to poke the BIOS settings on the existing installed base. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: 9.1: boot-time delay?
Hello. I suppose it's not possible to configure ahcisata in the BIOS on the long-delay machines? I'm guessing this is some quirk of the pciide(4) and piixide(4) drivers. Not to be too flip, but do you expect these machines to reboot frequently in production? If not, then I'd probably live with the delay on reboot as at this point, I'd be concerned that any fix I came up with for it would have implications down the road which might be more serious and more impactful on operations. I certainly understand the need to know what's going on, but if a machine only reboots once or twice a year in production, then ... Any newer hardware should have ahcisaata capable SATA controllers, so this problem should go away as the hardware ages out. -thanks -Brian On May 25, 3:33pm, Mouse wrote: } Subject: Re: 9.1: boot-time delay? } >> Will HZ=1000 be sufficient and does that reduce the boot time? } > The latter is a good question which is likely to hint at possible } > causes. I'll experiment with various HZ values and see what happens. } } At HZ=8000, the delay (based on the bracketed numbers) is almost } exactly 22 seconds. } } At HZ=4000, it's almost exactly 10 seconds. } } At HZ=2000, it's almost exactly 4.3 seconds. } } Based on a quadratic fit to those three data points, all I need to do } is set HZ to 0 and it will reach the rest of the boot 1.2 seconds } before it gets to uhub3. Clearly, this machine includes resublimated } thiotimoline somewhere in its hardware makeup. } } Seriously, though... } } I've just heard from one of the other people working with this. I was } told that, on a different hardware platform, the delay is gone even } with HZ=8000. I have an instance of that platform among my development } machines; I tried it and I see the same thing, even with a bit-for-bit } identical kernel. } } I spent a little time playing with boot -c and disabling various } things, as suggested by Martin Husemann upthread. } } Disabling ehci* (what that hardware uses for USB) shuts off all USB } support, of course. It does nothing for the delay. } } Disabling piixide* still gets the delay, but it also still finds wd0; } it just attaches pciide* instead of piixide*. } } Disabling both piixide* and pciide* gets rid of wd* _and_ gets rid of } the delay. (It doesn't boot fully, of course, beause it has no root } device. But it reaches the root device prompt after some four seconds } instead of 25-plus.) } } Disabling wd does not fix the delay, but it doesn't completely } eliminate wd; I still see "wd at ... not configured" messages, so it's } found in some sense. (I haven't tried completely removing wd from the } kernel config.) } } The machine without the delay has wd drives, but they attach at } ahcisata instead of piixide/pciide. The piixide attachment (the } machine that exhibits the delay) is } } piixide0 at pci0 dev 31 function 2: Intel 6 Series Serial ATA Controller (rev. 0x05) } piixide0: bus-master DMA support present } piixide0: primary channel configured to native-PCI mode } piixide0: using ioapic0 pin 19 for native-PCI interrupt } atabus0 at piixide0 channel 0 } piixide0: secondary channel configured to native-PCI mode } atabus1 at piixide0 channel 1 } ... } piixide1 at pci0 dev 31 function 5: Intel 6 Series Serial ATA Controller (rev. 0x05) } piixide1: bus-master DMA support present } piixide1: primary channel wired to native-PCI mode } piixide1: using ioapic0 pin 19 for native-PCI interrupt } atabus2 at piixide1 channel 0 } piixide1: secondary channel wired to native-PCI mode } atabus3 at piixide1 channel 1 } ... } wd0 at atabus0 drive 0 } wd0: } wd0: drive supports 16-sector PIO transfers, LBA48 addressing } wd0: 465 GB, 969021 cyl, 16 head, 63 sec, 512 bytes/sect x 976773168 sectors (0 bytes/physsect; first aligned sector: 8) } wd0: 32-bit data port } wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100), WRITE DMA FUA, NCQ (32 tags) } wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA), WRITE DMA FUA EXT } wd1 at atabus2 drive 0 } wd1: } wd1: drive supports 1-sector PIO transfers, LBA48 addressing } wd1: 7641 MB, 15525 cyl, 16 head, 63 sec, 512 bytes/sect x 15649200 sectors } wd1: 32-bit data port } wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), WRITE DMA FUA, NCQ (32 tags) } wd1(piixide1:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA), WRITE DMA FUA EXT } } In contrast, the ahcisata attachment, on the machine with no delay, is } } ahcisata0 at pci0 dev 23 function 0: vendor 8086 product a102 (rev. 0x31) } ahcisata0: 64-bit DMA } ahcisata0: AHCI revision 1.31, 4 ports, 32 slots, CAP 0xe734ff43 } ahcisata0: interrupting at msi1 vec 0 } atabus0 at ahcisata0 channel 0 } atabus1 at ahcisata0 channel 1 } atabus2 at ahcisata0 channel 2 } atabus3 at ahcisata0 channel 3 } ... } ahcisata0 port 0: device present, speed: 6.0Gb/s } ahcisata0 port 3: PHY offline }
Re: 9.1: boot-time delay?
>>> +optionsHZ=8000 > this can become a problem due to integer division. > any number of ticks less than hz (8000) will be rounded down to 0 in > a number of places now, where as before it was only less than 100. Dividing by hz would be used to...all I can think of is, to convert a tick count to seconds. I have trouble imagining why/when that would be a useful thing to do, nor who would do it yielding an integer and then doing something that would affect autoconf timing with the result. Since you've seen it happen, though, obviously someone has found a use for it. What am I missing? (See also my latest update; it seems to be related somehow to piixide/pciide.) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: 9.1: boot-time delay?
>> Will HZ=1000 be sufficient and does that reduce the boot time? > The latter is a good question which is likely to hint at possible > causes. I'll experiment with various HZ values and see what happens. At HZ=8000, the delay (based on the bracketed numbers) is almost exactly 22 seconds. At HZ=4000, it's almost exactly 10 seconds. At HZ=2000, it's almost exactly 4.3 seconds. Based on a quadratic fit to those three data points, all I need to do is set HZ to 0 and it will reach the rest of the boot 1.2 seconds before it gets to uhub3. Clearly, this machine includes resublimated thiotimoline somewhere in its hardware makeup. Seriously, though... I've just heard from one of the other people working with this. I was told that, on a different hardware platform, the delay is gone even with HZ=8000. I have an instance of that platform among my development machines; I tried it and I see the same thing, even with a bit-for-bit identical kernel. I spent a little time playing with boot -c and disabling various things, as suggested by Martin Husemann upthread. Disabling ehci* (what that hardware uses for USB) shuts off all USB support, of course. It does nothing for the delay. Disabling piixide* still gets the delay, but it also still finds wd0; it just attaches pciide* instead of piixide*. Disabling both piixide* and pciide* gets rid of wd* _and_ gets rid of the delay. (It doesn't boot fully, of course, beause it has no root device. But it reaches the root device prompt after some four seconds instead of 25-plus.) Disabling wd does not fix the delay, but it doesn't completely eliminate wd; I still see "wd at ... not configured" messages, so it's found in some sense. (I haven't tried completely removing wd from the kernel config.) The machine without the delay has wd drives, but they attach at ahcisata instead of piixide/pciide. The piixide attachment (the machine that exhibits the delay) is piixide0 at pci0 dev 31 function 2: Intel 6 Series Serial ATA Controller (rev. 0x05) piixide0: bus-master DMA support present piixide0: primary channel configured to native-PCI mode piixide0: using ioapic0 pin 19 for native-PCI interrupt atabus0 at piixide0 channel 0 piixide0: secondary channel configured to native-PCI mode atabus1 at piixide0 channel 1 ... piixide1 at pci0 dev 31 function 5: Intel 6 Series Serial ATA Controller (rev. 0x05) piixide1: bus-master DMA support present piixide1: primary channel wired to native-PCI mode piixide1: using ioapic0 pin 19 for native-PCI interrupt atabus2 at piixide1 channel 0 piixide1: secondary channel wired to native-PCI mode atabus3 at piixide1 channel 1 ... wd0 at atabus0 drive 0 wd0: wd0: drive supports 16-sector PIO transfers, LBA48 addressing wd0: 465 GB, 969021 cyl, 16 head, 63 sec, 512 bytes/sect x 976773168 sectors (0 bytes/physsect; first aligned sector: 8) wd0: 32-bit data port wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100), WRITE DMA FUA, NCQ (32 tags) wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA), WRITE DMA FUA EXT wd1 at atabus2 drive 0 wd1: wd1: drive supports 1-sector PIO transfers, LBA48 addressing wd1: 7641 MB, 15525 cyl, 16 head, 63 sec, 512 bytes/sect x 15649200 sectors wd1: 32-bit data port wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), WRITE DMA FUA, NCQ (32 tags) wd1(piixide1:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA), WRITE DMA FUA EXT In contrast, the ahcisata attachment, on the machine with no delay, is ahcisata0 at pci0 dev 23 function 0: vendor 8086 product a102 (rev. 0x31) ahcisata0: 64-bit DMA ahcisata0: AHCI revision 1.31, 4 ports, 32 slots, CAP 0xe734ff43 ahcisata0: interrupting at msi1 vec 0 atabus0 at ahcisata0 channel 0 atabus1 at ahcisata0 channel 1 atabus2 at ahcisata0 channel 2 atabus3 at ahcisata0 channel 3 ... ahcisata0 port 0: device present, speed: 6.0Gb/s ahcisata0 port 3: PHY offline ahcisata0 port 1: PHY offline ahcisata0 port 2: PHY offline ... wd0 at atabus0 drive 0 wd0: wd0: drive supports 16-sector PIO transfers, LBA48 addressing wd0: 1863 GB, 3876021 cyl, 16 head, 63 sec, 512 bytes/sect x 3907029168 sectors (0 bytes/physsect; first aligned sector: 8) wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), NCQ (32 tags) w/PRIO wd0(ahcisata0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA), NCQ (31 tags) w/PRIO I don't suppose this gives anyone any helpful thoughts? /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
re: 9.1: boot-time delay?
> > +optionsHZ=8000 this can become a problem due to integer division. any number of ticks less than hz (8000) will be rounded down to 0 in a number of places now, where as before it was only less than 100. i've seen this trip up in the kernel before, and sometimes that '0' means 'poll', and sometimes it means 'sleep forever'. a lot of places in the kernel *do* avoid (eg, with adding hz-1 and then dividing by hz) but there are a number that do not... .mrg.
Re: 9.1: boot-time delay?
> Is there a reason you need a frequency that high? Yes. I'm not just cranking HZ up to be difficult. (I might do it just to smoke out bugs...on my _own_ machines. Except, of course, that I don't use 9.x except at work, where the tradeoffs are different.) > Will HZ=1000 be sufficient and does that reduce the boot time? The former is "no", in the sense that it will not be suitable for production use on the turnkey system in question. See below. The latter is a good question which is likely to hint at possible causes. I'll experiment with various HZ values and see what happens. > Also,while I don't entirely understand all the timing mechanisms > inside NetBSD, it seems if you do need a higher frequency clock, I'd > suggest HZ=1, since it's a factor of 100 times the default, > rather than an odd 80 times the default. Unfortunately a value that is not a multiple of 8000 is not suitable. (Why would 100 be better than 80, by the way? I see no reason other than a human-layer bias towards powers of 10, so I'm wondering if that's all it is or if there's something I'm missing.) In this application, userland computes values and pushes them, via a parallel port, to external D->A hardware. (Among other things. Userland does quite a lot else, too; I am outlining just the stuff relevant to HZ=8000.) They need to be pushed at fairly precisely 8KHz to get the desired result out of the D->A, but they are produced burstily (IIRC, 4KHz bursts of two values each are most common). The cleanest solution would probably be an external FIFO with an external 8KHz clock. But, for reasons such as installed hardware base, addressing it in software is preferred if it works. (Getting customers to just install new software is much cheaper for the company than forklift upgrades would be.) I added code to the lp driver to buffer values in the kernel and push them out based on clock ticks. This works relatively well; the only downside we have uncovered is the boot-time delay, and even that surprises me. Doing it in software isn't perfect. There is still the occasional glitch in timing when the relevant interrupts are locked out. But, so far, those glitches have been a lower price than the external hardware. I was rather grasping at straws when I tried changing HZ to see if it fixed it; I was in somewhat of a "when you've eliminated the impossible..." place after removing all three video drivers (i915drmkms/intelfb, radeon/radeondrmkmsfb, and nouveau/nouveaufb) and finding the delay still present. I tried HZ next because there were very few changes between the two kernels, and HZ looked like the most likely of the ones remaining after the video drivers didn't fix it. I then tried the HZ change with all the video drivers present and it still was fast with no HZ option but slow with HZ=8000. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: 9.1: boot-time delay?
Hello. Is there a reason you need a frequency that high? Will HZ=1000 be sufficient and does that reduce the boot time? Also,while I don't entirely understand all the timing mechanisms inside NetBSD, it seems if you do need a higher frequency clock, I'd suggest HZ=1, since it's a factor of 100 times the default, rather than an odd 80 times the default. -thanks -Brian On May 25, 9:44am, Mouse wrote: } Subject: Re: 9.1: boot-time delay? } Last week, I wrote, here, of a delay when booting 9.1 } } >>> [ 3.288539] uhub2: 4 ports with 4 removable, self powered } >>> [ 3.288539] uhub3: 6 ports with 6 removable, self powered } >>> [25.272567] wd0 at atabus0 drive 0 } >>> [25.273568] wd0: } } and, in a later mail, } } > [A]s soon as I sent my mail and started looking at subsetting diffs, } > I discovered the diffs between the installer kernel and the } > operational kernel were far smaller than I remembered (and mostly } > irrelevant): } } It looks as though this, of all things, is the relevant line: } } > +optionsHZ=8000 } } My first reaction is that something somewhere is delaying by a fixed } number of ticks rather than doing arithmetic with hz, but that would } make the difference go in the other direction. I'm having trouble } coming up with a plausible scenario that would explain the delay } getting _longer_ with HZ=8000. The most plausible thing I've thought } of is that some delay is being computed based on hz and then getting } treated as milliseconds instead of ticks or some such and getting } multiplied by hz again. } } Anyone have any thoughts on possible ways to track this one down? I'm } going to be doing what I can, but any thoughts anyone has would be } welcomed; I am only minimally familiar with 9.1's kernel internals. } } /~\ The ASCII Mouse } \ / Ribbon Campaign } X Against HTML mo...@rodents-montreal.org } / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B >-- End of excerpt from Mouse
Re: 9.1: boot-time delay?
Last week, I wrote, here, of a delay when booting 9.1 >>> [ 3.288539] uhub2: 4 ports with 4 removable, self powered >>> [ 3.288539] uhub3: 6 ports with 6 removable, self powered >>> [25.272567] wd0 at atabus0 drive 0 >>> [25.273568] wd0: and, in a later mail, > [A]s soon as I sent my mail and started looking at subsetting diffs, > I discovered the diffs between the installer kernel and the > operational kernel were far smaller than I remembered (and mostly > irrelevant): It looks as though this, of all things, is the relevant line: > +options HZ=8000 My first reaction is that something somewhere is delaying by a fixed number of ticks rather than doing arithmetic with hz, but that would make the difference go in the other direction. I'm having trouble coming up with a plausible scenario that would explain the delay getting _longer_ with HZ=8000. The most plausible thing I've thought of is that some delay is being computed based on hz and then getting treated as milliseconds instead of ticks or some such and getting multiplied by hz again. Anyone have any thoughts on possible ways to track this one down? I'm going to be doing what I can, but any thoughts anyone has would be welcomed; I am only minimally familiar with 9.1's kernel internals. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B