Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Michael van Elst
On Fri, May 28, 2021 at 05:13:02AM +0700, Robert Elz wrote:

> But it isn't, you can't convert 60 ticks/second into some number of
> milliseconds, the two are different units.

Sure you can. period = 1/frequency.

But HZ (and now hz) in the code is also used as a period, thus the idioms:

delay(5*HZ)wait 5 seconds
timeout(ttrstrt, tp, HZ/4) trigger ttrstrt in 1/4 second

That's the simplicity when your time units are 1/HZ, then the same
number (called HZ) can stand for a frequency and a one second period.


If the programming language would use measures (number + specific
unit) instead of (integer) numbers, it became more explicit. But
I guess a modern language would still allow to use the same value
as a frequency or a period with some automatic type conversion.


Can we go back now to the original problem ?


Greetings,
-- 
Michael van Elst
Internet: mlel...@serpens.de
"A potential Snark may lurk in every tree."


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Johnny Billquist

On 2021-05-28 00:46, Joerg Sonnenberger wrote:

On Fri, May 28, 2021 at 03:14:24AM +0700, Robert Elz wrote:

 Date:Thu, 27 May 2021 05:05:15 - (UTC)
 From:mlel...@serpens.de (Michael van Elst)
 Message-ID:  

   | mlel...@serpens.de (Michael van Elst) writes:
   |
   | >Either direction mstohz or hztoms should better always round up to
   | >guarantee a minimal delay.
   |
   | And both should be replaced by hztous()/ustohz().

While changing ms to us is probably a good idea, when a change happens,
the "hz" part should be changed too.

hz is (a unit of) a measure of frequency, ms (or us) is (a unit of) a
measure of time (duration) - converting one to the other makes no sense.


"hz" in this context comes from HZ - it is used here as dimension of
1s/HZ. Just like "ms" here is used as "1s/1000". It's a pretty sensible
naming compared to much longer naming variants.


Well, Robert have a point.

If you say hztoms(4), we are not asking for a conversion from 4Hz to ms, 
but in fact four ticks at the current HZ, converted to ms.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Joerg Sonnenberger
On Fri, May 28, 2021 at 03:14:24AM +0700, Robert Elz wrote:
> Date:Thu, 27 May 2021 05:05:15 - (UTC)
> From:mlel...@serpens.de (Michael van Elst)
> Message-ID:  
> 
>   | mlel...@serpens.de (Michael van Elst) writes:
>   |
>   | >Either direction mstohz or hztoms should better always round up to
>   | >guarantee a minimal delay.
>   |
>   | And both should be replaced by hztous()/ustohz().
> 
> While changing ms to us is probably a good idea, when a change happens,
> the "hz" part should be changed too.
> 
> hz is (a unit of) a measure of frequency, ms (or us) is (a unit of) a
> measure of time (duration) - converting one to the other makes no sense.

"hz" in this context comes from HZ - it is used here as dimension of
1s/HZ. Just like "ms" here is used as "1s/1000". It's a pretty sensible
naming compared to much longer naming variants.

Joerg


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Johnny Billquist

On 2021-05-28 00:13, Robert Elz wrote:

 Date:Thu, 27 May 2021 20:19:06 +
 From:"Koning, Paul" 
 Message-ID:  <8765ae3a-b5b7-4b67-82ce-93473a5b9...@dell.com>

   | In this particular case it's converting frequency to period,
   | that is a sensible conversion.

But it isn't, you can't convert 60 ticks/second into some number of
milliseconds, the two are different units.


Not that much.
To quote wikipedia:

"The dimension of the unit hertz is 1/time (1/T). Expressed in base SI 
units it is 1/second (1/s)."


So basically, it's just the inverse of time. Based on that, it's pretty 
clear that conversion to/from time is very valid.



And in another reply:

Johnny Billquist  said:
   | Frequency essentially means a counting of the number of  time something
   | happens over a specific time period. With hertz, the time  period is one
   | second.
  
Sure.


   | So then converting the number of times an event
   | happens in a second into how long it is between two events makes total
   | sense.

It would, but that's not what the functions do.   What they do is tell
how many ticks occur in a specific number of milliseconds (or vice
versa).   Your calculation is just (in milliseconds) 1000/hz, and assuming
hz isn't varying, is a constant.


Good point. It's a conversion from ms (or whatever time) to/from the 
number of cycles (or ticks of you want) based on the frequency given.


I didn't think this through enough. While I certainly believe it's 
perfectly valid to convert between a frequency and a time for a single 
cycle, it do become weird to talk about frequency if we are in fact 
talking about some specific number of cycles, although those cycles can 
only be converted to a time if we have a frequency.


I guess you convinced me. We should really call it something like 
tickstoms and mstoticks. But that do rely on people then understanding 
that it is the ticks defined by HZ, and not any random ticks.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Robert Elz
Date:Thu, 27 May 2021 20:19:06 +
From:"Koning, Paul" 
Message-ID:  <8765ae3a-b5b7-4b67-82ce-93473a5b9...@dell.com>

  | In this particular case it's converting frequency to period,
  | that is a sensible conversion.

But it isn't, you can't convert 60 ticks/second into some number of
milliseconds, the two are different units.   That's just the same as
you can't convert metres/second (velocity) into seconds.   Given a
particular velocity, and some number of metres, you can calculate the
time it takes to move that far, but that isn't converting velocity
into seconds.

What it is happening is that (in one direction of the other, depending
upon which function) it is converting between the number of ticks that
occur and the duration of an interval (which of course depends upon the
frequency, but it is not converting the frequency).

The hztoms() function is no different than a ustoms() function, except
that in the former we have a semi-variable (the frequency) which is simply
a constant (1000) in the second - but that's only a variable because we
allow HZ to vary (by architecture, and sometimes, configuration).   Calling
ustoms() thousandtoms() would be absurd.   So is calling this one hztoms().

  | You could say "hztoperiodinus" but that's rather verbose.

That doesn't help, we're still not converting a frequency to a period.

And in another reply:

Johnny Billquist  said:
  | Frequency essentially means a counting of the number of  time something
  | happens over a specific time period. With hertz, the time  period is one
  | second.
 
Sure.

  | So then converting the number of times an event 
  | happens in a second into how long it is between two events makes total 
  | sense.

It would, but that's not what the functions do.   What they do is tell
how many ticks occur in a specific number of milliseconds (or vice
versa).   Your calculation is just (in milliseconds) 1000/hz, and assuming
hz isn't varying, is a constant.

  | A tick is not a duration. A tick is a specific event at a specific time.  It
  | has no duration. You have a duration between two ticks.

Sure, reasonable point, but as Mouse said, when we're dealing with this
stuff the number of ticks is counted as a representation of the number
of those durations, and we just say how many ticks happened.  The ticks
represent the duration between them - that might be slightly sloppy, but
it isn't outright wrong.

kre




Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Johnny Billquist

On 2021-05-27 22:50, Mouse wrote:

A tick is not a duration.  A tick is a specific event at a specific
time.  It has no duration.  You have a duration between two ticks.


At least as I use it and have heard it used, `tick' can also be used to
refer to the interval between two of those events.  "How long are timer
ticks on this hardware?" or "For the next few ticks, we crunch on
this...".


I think that is mostly a bit sloppy terminology where they actually want 
to know the time between two ticks. :-)


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Mouse
> A tick is not a duration.  A tick is a specific event at a specific
> time.  It has no duration.  You have a duration between two ticks.

At least as I use it and have heard it used, `tick' can also be used to
refer to the interval between two of those events.  "How long are timer
ticks on this hardware?" or "For the next few ticks, we crunch on
this...".

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Johnny Billquist

On 2021-05-27 22:14, Robert Elz wrote:

 Date:Thu, 27 May 2021 05:05:15 - (UTC)
 From:mlel...@serpens.de (Michael van Elst)
 Message-ID:  

   | And both should be replaced by hztous()/ustohz().

While changing ms to us is probably a good idea, when a change happens,
the "hz" part should be changed too.


Not sure I agree with that.


hz is (a unit of) a measure of frequency, ms (or us) is (a unit of) a
measure of time (duration) - converting one to the other makes no sense.


It sure does. Frequency essentially means a counting of the number of 
time something happens over a specific time period. With hertz, the time 
period is one second. So then converting the number of times an event 
happens in a second into how long it is between two events makes total 
sense.



What these functions/macros do is convert between ms (or us) and ticks
(another measure of a duration), not hz, so the misleading "hz" part of
the name should be removed (changed) if a new macro/function is to be invented.
(The benefit isn't worth it to justify changing the current existing names,
but we shouldn't persist with nonsense if we're doing something new.)


A tick is not a duration. A tick is a specific event at a specific time. 
It has no duration. You have a duration between two ticks.


And commonly at every tick you get an interrupt, and you do something, 
and then resume your normal work. The tick has passed. Waiting for the 
next one to happen.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Koning, Paul



> On May 27, 2021, at 4:14 PM, Robert Elz  wrote:
> 
> ...
> While changing ms to us is probably a good idea, when a change happens,
> the "hz" part should be changed too.
> 
> hz is (a unit of) a measure of frequency, ms (or us) is (a unit of) a
> measure of time (duration) - converting one to the other makes no sense.

In this particular case it's converting frequency to period, that is a sensible 
conversion.  You could say "hztoperiodinus" but that's rather verbose.

paul




Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Robert Elz
Date:Thu, 27 May 2021 05:05:15 - (UTC)
From:mlel...@serpens.de (Michael van Elst)
Message-ID:  

  | mlel...@serpens.de (Michael van Elst) writes:
  |
  | >Either direction mstohz or hztoms should better always round up to
  | >guarantee a minimal delay.
  |
  | And both should be replaced by hztous()/ustohz().

While changing ms to us is probably a good idea, when a change happens,
the "hz" part should be changed too.

hz is (a unit of) a measure of frequency, ms (or us) is (a unit of) a
measure of time (duration) - converting one to the other makes no sense.

What these functions/macros do is convert between ms (or us) and ticks
(another measure of a duration), not hz, so the misleading "hz" part of
the name should be removed (changed) if a new macro/function is to be invented.
(The benefit isn't worth it to justify changing the current existing names,
but we shouldn't persist with nonsense if we're doing something new.)

kre

ps: note that the variable "hz" (and the macro HZ) are used correctly --
their values are frequencies (ticks/second).



Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-27 Thread Mouse
>> How heavily is hztoms used?

> [18 uses]

That's...almost none, seems to me.  And these

> sys/dev/ic/mvsata.c:  timeout = mstohz(timeout + hztoms(1) - 1);
> sys/dev/ic/mvsata.c:  ata_delay(chp, hztoms(1), "mvsata_edma2", 
> wflags);
> sys/dev/sdmmc/if_bwfm_sdio.c: sdmmc_pause(hztoms(1)*1000, NULL);

look to me like bugs waiting to happen - and the first one looks as
though it might even be related to the bug that got us talking about
this; it changes the units of the number in timeout (from ms to hz) in
just the sort of way that could have led to what I saw.

The last, it seems to me, really should be hztoms(1000).

And these

> sys/dev/i2c/tsllux.c: if (ms < hztoms(1)) {
> sys/dev/pci/ixgbe/ixgbe_netbsd.c: else if ((us / 1000) >= hztoms(1)) {

might be as well; I'd have to read more context.

> sys/dev/usb/if_axe.c: usbd_delay_ms(sc->axe_un.un_udev, hztoms(y));   
> \
> sys/external/bsd/drm2/include/linux/jiffies.h:return hztoms(j);
> sys/external/bsd/drm2/include/linux/sched.h:  unsigned ms 
> = hztoms(MIN((unsigned long)timeout,
> sys/kern/sched_4bsd.c:int rttsms = hztoms(sched_rrticks);
> sys/kern/sched_m2.c:  int rttsms = hztoms(sched_rrticks);
> sys/kern/sched_m2.c:  newsize = hztoms(min_ts);
> sys/kern/sched_m2.c:  newsize = hztoms(max_ts);

These might be as well, but they look at least slightly more
reasonable.

> sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));
> sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));
> sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));
> sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));

And these are not only bugs waiting to happen (consider HZ=25), they're
extremely peculiar.  Why hztoms(hz/32) rather than just 30, or 31, or
whatever?  Weird.

Would it be useful for me to send-pr this, or is it more likely to be
something nobody would do anything with?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Michael van Elst
mlel...@serpens.de (Michael van Elst) writes:

>Either direction mstohz or hztoms should better always round up to
>guarantee a minimal delay.

And both should be replaced by hztous()/ustohz().

Microseconds allow a time value of ~35 minutes as 32bit signed
integer, which should be a safe range for delays.

A higher resolution might have its use too, but then you have
to assure that such values are always carried in 64bit integers
or handled by a structure. There is already a hz2bintime()
macro in that realm, which is unsafe (truncates to milliseconds)
but AFAIK also unused.



Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Michael van Elst
mo...@rodents-montreal.org (Mouse) writes:

>How heavily is hztoms used?  (I would expect mstohz to be used far more
>heavily.)

sys/dev/acpi/acpi_cpu_cstate.c:   sc->sc_cstate_sleep = 
hztoms(acpitimer_delta(end, start)) * 1000;
sys/dev/spkr_audio.c: audiobell(sc->sc_audiodev, xhz, hztoms(ticks), 
sc->sc_spkr.sc_vol, 0);
sys/dev/i2c/tsllux.c: if (ms < hztoms(1)) {
sys/dev/ic/mvsata.c:  timeout = mstohz(timeout + hztoms(1) - 1);
sys/dev/ic/mvsata.c:  ata_delay(chp, hztoms(1), "mvsata_edma2", 
wflags);
sys/dev/pci/ixgbe/ixgbe_netbsd.c: else if ((us / 1000) >= hztoms(1)) {
sys/dev/sdmmc/if_bwfm_sdio.c: sdmmc_pause(hztoms(1)*1000, NULL);
sys/dev/usb/if_axe.c: usbd_delay_ms(sc->axe_un.un_udev, hztoms(y)); 
  \
sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));
sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));
sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));
sys/dev/usb/if_axe.c: usbd_delay_ms(un->un_udev, hztoms(hz / 32));
sys/external/bsd/drm2/include/linux/jiffies.h:return hztoms(j);
sys/external/bsd/drm2/include/linux/sched.h:  unsigned ms = 
hztoms(MIN((unsigned long)timeout,
sys/kern/sched_4bsd.c:int rttsms = hztoms(sched_rrticks);
sys/kern/sched_m2.c:  int rttsms = hztoms(sched_rrticks);
sys/kern/sched_m2.c:  newsize = hztoms(min_ts);
sys/kern/sched_m2.c:  newsize = hztoms(max_ts);


>> 0 can sometimes mean "never block" and sometimes can mean "block
>> forever".

>What does hztoms's return value get used for?  Is it actually used to
>compute delay values?  It sounds dodgy to me to do so, for this among
>other reasons.

Mostly to compute a delay.

Either direction mstohz or hztoms should better always round up to
guarantee a minimal delay.



Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Mouse
> #  define hztoms(t) ((unsigned int)(((t) + 0ul) * 1000ul / hz))

> when hz > 1000, this returns 0 for input of 1.

True, which is correct - when hz > 1000, one tick is less than one
millisecond.

If, that is, it's defined to round down (and, if not, the
implementation you quote is broken).  What does its interface contract
say?  (I don't have easy access to 9.1 at the moment, or I'd go
looking.)

How heavily is hztoms used?  (I would expect mstohz to be used far more
heavily.)

> 0 can sometimes mean "never block" and sometimes can mean "block
> forever".

What does hztoms's return value get used for?  Is it actually used to
compute delay values?  It sounds dodgy to me to do so, for this among
other reasons.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread matthew green
> > Me too.  I was - am - rather puzzled by it.
>
> Right. That was my issue. Claiming that you'd get more problems with 
> rounding and issues with coarsness when running at a higher frequency, 
> when it in fact is the exact opposite. With a higher frequency you have 
> more accuracy and less errors from truncations.
>
> Anyway, I have no idea what the actual problem is. Good luck with that part.

as i posted earlier in this thread:

#  define hztoms(t) ((unsigned int)(((t) + 0ul) * 1000ul / hz))

when hz > 1000, this returns 0 for input of 1.  0 can sometimes
mean "never block" and sometimes can mean "block forever".


.mrg.


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Johnny Billquist

On 2021-05-26 12:59, Mouse wrote:

I don't fully understand the discussion here.



Initially people claimed that HZ at 8000 would be a problem,


Well, my experience indicates that it _is_ a problem, at least when
using disks at piixide (or pciide).


Right. But not for the reason suggested. But obviously there is some 
kind of a problem somewhere.



which for me seems a bit backwards.


Me too.  I was - am - rather puzzled by it.


Right. That was my issue. Claiming that you'd get more problems with 
rounding and issues with coarsness when running at a higher frequency, 
when it in fact is the exact opposite. With a higher frequency you have 
more accuracy and less errors from truncations.


Anyway, I have no idea what the actual problem is. Good luck with that part.

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Mouse
> I don't fully understand the discussion here.

> Initially people claimed that HZ at 8000 would be a problem,

Well, my experience indicates that it _is_ a problem, at least when
using disks at piixide (or pciide).

> which for me seems a bit backwards.

Me too.  I was - am - rather puzzled by it.

> I can only see two real problems with a high clock frequency:

> 1. The overhead of just dealing with clock interrupts increase.

Yes.  This is a potential issue.  When I set HZ=8000 it was one of the
first things I looked at.  The extra interrupt handling load is
measurable, but small enough that it was an acceptable price for making
the application work.  IIRC it was on the order of 1-2% of one core.

> 2. If there are things that just give time in ticks, then ticks
> become very short.  And if the assumption is that just one tick is
> enough, such an assumption can fail.

Yes.  But I have trouble seeing how such a failure could lead to a
delay that is (approximately) linear in HZ.  The only thing I've been
able to think of is that some delay somehow is having mstohz(), or
moral equivalent, applied to it twice.  That is, mstohz is applied, and
then some other code assumes the result is ms rather than ticks and
applies mstohz to it again.  At HZ=100 this means you get a tenth the
delay you think you're getting.  At HZ=1000 it's what it should be.  At
HZ=8000 it's eight times what it should be.  Based on that theory, I
would suspect someone has written a delay number that should be three
seconds and is actually getting .3 seconds at HZ=100, going up to 6s at
HZ=2000, 12s at 4000, and 24s at 8000, but about 2s of other stuff
happens in parallel with it, so I see the delay as 2s shorter than it
is.  I could easily see myself thinking "weird, I have to ask for
3000ms to get a delay long enough for this to work, doesn't _look_ like
three seconds, must investigate sometime" and never getting around to
actually investigating.

My principal reason for thinking this is so is that when I disabled
USB, the delay (at HZ=8000) went from 22s to about 24s - but the
absolute time at which the delay ended, the number printed in brakcets,
stayed (approximately) constant at about 25.3.  This matches well with
the theory that a delay of about 24s starts early and runs in parallel
with the rest of autoconf, and USB takes about two seconds, so I see
22s of delay.  But then when I disable USB, it doesn't hide 2s of that
24s, so I see the whole thing.  (I'd suspect 25s of delay except that
the numbers in brackets seem to start at 1.  Also, 24s matches better
with something getting multiplied by 80 than 25 does.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Johnny Billquist

On 2021-05-26 11:12, matthew green wrote:

Manuel Bouyer writes:

On Wed, May 26, 2021 at 05:35:53PM +1000, matthew green wrote:

Manuel Bouyer writes:

On Tue, May 25, 2021 at 10:46:04PM -, Michael van Elst wrote:

bou...@antioche.eu.org (Manuel Bouyer) writes:


Another issue could be mstohz() called with a delay too short;
mstohz() will round it up to 1 tick.



#  define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul))


actually, this one is problematic as well.

mstohz(1) here gives 0 for hz < 1000.  "(1 + 0) * hz / 1000"
for any 'hz' less than 1000 will give 0.


I don't fully understand the discussion here.
Initially people claimed that HZ at 8000 would be a problem, which for 
me seems a bit backwards. And this comment should make that even more 
obvious.


With hz at 8000, you actually get some usable value for mstohz(1), while 
with low hz definitions, you do not.


So why would a high frequency at a clock be a problem? Seems the person 
who claimed that must have gotten things a bit backwards.


I can only see two real problems with a high clock frequency:
1. The overhead of just dealing with clock interrupts increase.
2. If there are things that just give time in ticks, then ticks become 
very short. And if the assumption is that just one tick is enough, such 
an assumption can fail.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread matthew green
Manuel Bouyer writes:
> On Wed, May 26, 2021 at 05:35:53PM +1000, matthew green wrote:
> > Manuel Bouyer writes:
> > > On Tue, May 25, 2021 at 10:46:04PM -, Michael van Elst wrote:
> > > > bou...@antioche.eu.org (Manuel Bouyer) writes:
> > > > 
> > > > >Another issue could be mstohz() called with a delay too short;
> > > > >mstohz() will round it up to 1 tick.
> > > > 
> > > > 
> > > > #  define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul))

actually, this one is problematic as well.

mstohz(1) here gives 0 for hz < 1000.  "(1 + 0) * hz / 1000"
for any 'hz' less than 1000 will give 0.

> > it's hztoms() that is the problem here.
> > 
> > #  define hztoms(t) ((unsigned int)(((t) + 0ul) * 1000ul / hz))
> > 
> > ah... this is the new one.  the old one was:
> > 
> > #define hztoms(t) \
> >  (__predict_false((t) >= 0x2) ? \
> >  ((t +0u) / hz) * 1000u : \
> >  ((t +0u) * 1000u) / hz)
> > 
> > looks like christos fixed it in 2019.
>
> I'm not sure how christos's change could be a fix. I introduced hztoms()
> and mstohz() to avoid interger overflow for large values, and it looks like
> christos reintroduced it for 32bits platforms :(

there's an inline for 32 bit now, not the above code.

i realise the problem i recall here:  it happens with both
of the above with hz is > 1000.  "1 * 1000 / 1024" is 0.


.mrg.


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Manuel Bouyer
On Wed, May 26, 2021 at 05:35:53PM +1000, matthew green wrote:
> Manuel Bouyer writes:
> > On Tue, May 25, 2021 at 10:46:04PM -, Michael van Elst wrote:
> > > bou...@antioche.eu.org (Manuel Bouyer) writes:
> > > 
> > > >Another issue could be mstohz() called with a delay too short;
> > > >mstohz() will round it up to 1 tick.
> > > 
> > > 
> > > #  define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul))
> > > 
> > > If mstohz() would round up to full ticks, it could actually avoid
> > > some pitfalls. But it doesn't.
> >
> > indeed. But in this case the problem will show up with smaller hz, not
> > larger. So I think we can rule out this case.
> 
> it's hztoms() that is the problem here.
> 
> #  define hztoms(t) ((unsigned int)(((t) + 0ul) * 1000ul / hz))
> 
> ah... this is the new one.  the old one was:
> 
> #define hztoms(t) \
>  (__predict_false((t) >= 0x2) ? \
>  ((t +0u) / hz) * 1000u : \
>  ((t +0u) * 1000u) / hz)
> 
> looks like christos fixed it in 2019.

I'm not sure how christos's change could be a fix. I introduced hztoms()
and mstohz() to avoid interger overflow for large values, and it looks like
christos reintroduced it for 32bits platforms :(

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread matthew green
Manuel Bouyer writes:
> On Tue, May 25, 2021 at 10:46:04PM -, Michael van Elst wrote:
> > bou...@antioche.eu.org (Manuel Bouyer) writes:
> > 
> > >Another issue could be mstohz() called with a delay too short;
> > >mstohz() will round it up to 1 tick.
> > 
> > 
> > #  define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul))
> > 
> > If mstohz() would round up to full ticks, it could actually avoid
> > some pitfalls. But it doesn't.
>
> indeed. But in this case the problem will show up with smaller hz, not
> larger. So I think we can rule out this case.

it's hztoms() that is the problem here.

#  define hztoms(t) ((unsigned int)(((t) + 0ul) * 1000ul / hz))

ah... this is the new one.  the old one was:

#define hztoms(t) \
 (__predict_false((t) >= 0x2) ? \
 ((t +0u) / hz) * 1000u : \
 ((t +0u) * 1000u) / hz)

looks like christos fixed it in 2019.

revision 1.615
date: 2019-09-28 08:10:58 -0700;  author: christos;  state: Exp;  lines: +25 
-12;  commitid: QCMKDk7BwqyE4NEB;
For 32 bit the mstohz and hztoms functions evaluate their parameter multiple
times. This is inefficient for cases like:
unsigned ms = hztoms(MIN(timeout, mstohz(INT_MAX)));
Make them inline functions; also provide the 64 bit versions for them here
so all the LP64 machines can use them (before only amd64 and sparc64
specialized mstohz).
Make them both return unsigned int.

which explains why the change i made in mid-2019 that seemed to no
longer be necessary was no longer necessary!


.mrg.


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-26 Thread Manuel Bouyer
On Tue, May 25, 2021 at 10:46:04PM -, Michael van Elst wrote:
> bou...@antioche.eu.org (Manuel Bouyer) writes:
> 
> >Another issue could be mstohz() called with a delay too short;
> >mstohz() will round it up to 1 tick.
> 
> 
> #  define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul))
> 
> If mstohz() would round up to full ticks, it could actually avoid
> some pitfalls. But it doesn't.

indeed. But in this case the problem will show up with smaller hz, not
larger. So I think we can rule out this case.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-25 Thread Michael van Elst
bou...@antioche.eu.org (Manuel Bouyer) writes:

>Another issue could be mstohz() called with a delay too short;
>mstohz() will round it up to 1 tick.


#  define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul))

If mstohz() would round up to full ticks, it could actually avoid
some pitfalls. But it doesn't.



Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-25 Thread Manuel Bouyer
On Tue, May 25, 2021 at 04:04:56PM -0400, Mouse wrote:
> > I suppose it's not possible to configure ahcisata in the BIOS on the
> > long-delay machines?
> 
> Thank you very much!  Yes.  That is possible - and it fixes the delay.
> I would not have thought to look for that; I would not have expected
> piixide and ahcisata to be similar enough that a BIOS setting could
> personality-swap between them.
> 
> > I'm guessing this is some quirk of the pciide(4) and piixide(4)
> > drivers.
> 
> Sounds like; they presumably have a bug somewhere in some delay
> calculation.  But at least I have something approaching a workaround.

The reset and probe procedure is different bewteen ide and ahci. 
The problem is probably in this area.

Actually the root cause may be a delay too short, not too long.
AFAIK the code uses mstohz() everywhere to compute tick values, exept
maybe a few cases where we want a really short delay and we use 1.
This is where a very high HZ may cause a too short delay.
Another issue could be mstohz() called with a delay too short;
mstohz() will round it up to 1 tick.

But I think you will need to instrument the ide probe in dev/ic/wdc.c.
It's been a while since I last looked at this, but I think the code
you want to look at is wdc_drvprobe().

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-25 Thread Mouse
>>> I suppose it's not possible to configure ahcisata in the BIOS on
>>> the long-delay machines?
>> I would not have thought to look for that; I would not have expected
>> piixide and ahcisata to be similar enough that a BIOS setting could
>> personality-swap between them.
> IIRC, they are not really that similar.  But the dmesg excerpt you
> cited had piixide at *Intel 6 Series Serial ATA Controller*.  So,
> clearly, the device had to support ahci mode.  Well, "clearly" for
> those in the know.

Indeed!

Still, I wouldn't've thought to look for it because I wasn't aware
there was *any* hardware that could personality-flip like that.  For
someone familiar with that hardware, or with other hardware that can
do it, it'd be a reasonable thing - I guess those are your "those in
the know"!

Which didn't include me.  At least until now. :-)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-25 Thread Christoph Badura
On Tue, May 25, 2021 at 04:04:56PM -0400, Mouse wrote:
> > I suppose it's not possible to configure ahcisata in the BIOS on the
> > long-delay machines?
> 
> Thank you very much!  Yes.  That is possible - and it fixes the delay.
> I would not have thought to look for that; I would not have expected
> piixide and ahcisata to be similar enough that a BIOS setting could
> personality-swap between them.

IIRC, they are not really that similar.  But the dmesg excerpt you cited
had piixide at *Intel 6 Series Serial ATA Controller*.  So, clearly, the
device had to support ahci mode.  Well, "clearly" for those in the know.

--chris


Re: 9.1: boot-time delay? [WORKAROUND FOUND]

2021-05-25 Thread Mouse
> I suppose it's not possible to configure ahcisata in the BIOS on the
> long-delay machines?

Thank you very much!  Yes.  That is possible - and it fixes the delay.
I would not have thought to look for that; I would not have expected
piixide and ahcisata to be similar enough that a BIOS setting could
personality-swap between them.

> I'm guessing this is some quirk of the pciide(4) and piixide(4)
> drivers.

Sounds like; they presumably have a bug somewhere in some delay
calculation.  But at least I have something approaching a workaround.

> Not to be too flip, but do you expect these machines to reboot
> frequently in production?

Often enough that my colleague on the customer-interface side of things
thinks twenty seconds of delay on reboot is a problem.

> If not, then I'd probably live with the delay on reboot as at this
> point, [...]

Me too.  I didn't even realize it was there until they pointed it out
to me; I don't really notice twenty seconds more delay when the typical
peecee BIOS imposes anywhere from thirty seconds to five minutes before
it even starts to boot.  I'm obviously not part of their target market
for this thing.

> I'd be concerned that any fix I came up with for it would have
> implications down the road which might be more serious and more
> impactful on operations.

Fair concern.

> I certainly understand the need to know what's going on, but if a
> machine only reboots once or twice a year in production, then ...

Also fair.  But I think once a day is the very fewest reboots we're
likely to see for this, though I'm not really all that in touch with
the user base, so

I'll push this over to them and see if it's an acceptable workaround to
poke the BIOS settings on the existing installed base.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B