Re: PSA: Clock drift and pkgin
I believe you. But aren't we now getting into pretty serious realtime stuff? I'm not sure NetBSD is at all suitable for such applications/environments. As you say - if this only behaves acceptably when the system is not under load, then it's not a solution I would go for.

But again - I guess we're talking very personal opinions/experiences/tradeoffs here. I still don't really believe we need very high resolution scheduling. And tickless is sort of a separate topic from this.

  Johnny

On 2023-12-31 05:20, Konrad Schroder wrote:
> On 12/30/2023 3:42 PM, Johnny Billquist wrote:
>> On 2023-12-31 00:11, Michael van Elst wrote:
>>> Better than 100Hz is possible and still precise. Something around
>>> 1000Hz is necessary for human interaction. Modern hardware could
>>> easily do 100kHz.
>>
>> ? If I remember right, anything less than 200ms is immediate response
>> for a human brain. Which means you can get away with much coarser
>> than even 100Hz. And there are certainly lots of examples of older
>> computers with clocks running in the 10s of ms, where human
>> interaction feels perfect.
>
> I'm not sure about visual and auditory sensation, but haptic VR
> requires position updates >= 1000Hz to get texture right. The timing
> of two impulses that close together may not be felt as two separate
> events, but the frequency of vibrations within the skin when it
> interacts with a surface (even through a tool, such as a stylus) is
> encoded by the nerve endings in the skin itself. We used to use
> PHANTOM haptic arms at $WORK, driven by an Indigo2. If the control
> loop operated at less than 1000Hz---for example, if the Indigo2 was
> under load---it introduced noticeable differences in the sensation of
> running the pen over a virtual object. The simulation was much more
> sensitive to that than it was to the timing of the video output, for
> which anything greater than 72Hz was wasted.
>
> Take care,
> -Konrad

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se            ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Re: Perceivable time differences [was Re: PSA: Clock drift and pkgin]
Ok. I oversimplified. If I remember right, the point was that something sub 200ms is perceived by the brain as an "instantaneous" response. That doesn't mean one cannot discern shorter times, just that from an action-reaction point of view, anything below 200ms is "good enough".

My point was merely that I don't believe you need something down to ms resolution when it comes to human interaction, which was the claim I reacted to.

  Johnny

On 2023-12-31 02:47, Mouse wrote:
>> ? If I remember right, anything less than 200ms is immediate response
>> for a human brain.
>
> "Response"? For some purposes, it is. But under the right conditions
> humans can easily discern time deltas in the sub-200ms range.
>
> I just did a little psychoacoustics experiment on myself.
>
> First, I generated (44.1kHz) soundfiles containing two single-sample
> ticks separated by N samples, for N being 1, 101, 201, 401, 801, and
> going up by 800 from there to 6401, with a second of silence before
> and after (see notes below for the commands used):
>
> for d in 0 100 200 400 800 1600 2400 3200 4000 4800 5600 6400
> do
>   ( count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
>     echo 0 128 0 128
>     count from 0 to $d | sed -e "s/.*/0 0 0 0/"
>     echo 0 128 0 128
>     count from 0 to 44100 | sed -e "s/.*/0 0 0 0/"
>   ) | code-to-char > zz.$d
> done
>
> I don't know stock NetBSD analogs for count and code-to-char. count,
> as used here, just counts as the command line indicates; given what
> count's output is piped into, the details don't matter much.
> code-to-char converts numbers 0..255 into single bytes with the same
> values, with non-digits ignored except that they serve to separate
> numbers. (The time delta between the beginnings of the two ticks is
> of course one more than the number of samples between the two ticks.)
>
> After listening to them, I picked the 800 and 1600 files and did the
> test. I grabbed 128 bits from /dev/urandom and used them to play,
> randomly, either one file or the other, letting me guess which one it
> was in each case:
>
> dd if=/dev/urandom bs=1 count=16 |
> char-to-code |
> cvtbase -m8 d b |
> sed -e 's/./& /g' -e 's/ $//' -e 's/0/800/g' -e 's/1/1600/g' |
> tr \  \\n |
> ( exec 3>zz.list 4>zz.guess 5</dev/tty
>   while read n
>   do
>     echo $n 1>&3
>     audioplay -f -c 2 -e slinear_le -P 16 -s 44100 < zz.$n
>     skipcat 0 1 0<&5 1>&4
>   done
> )
>
> char-to-code is the inverse of code-to-char: for each byte of input,
> it produces one line of output containing the ASCII decimal for that
> byte's value, 0..255. cvtbase -m8 d b converts decimal to binary,
> generating a minimum of 8 "digits" (bits) of output for each input
> number. skipcat, as used here, has the I/O behaviour of "dd bs=1
> count=1" but without the blather on stderr: it skips no bytes and
> copies one byte, then exits. (The use of /dev/urandom is to ensure
> that I have no a priori hint which file is being played which time.)
>
> I then typed "s" when I thought it was a short-gap file and "l" when I
> thought it was a long-gap file. I got tired of it after 83 data
> samples and killed it. I then postprocessed zz.guess and compared it
> to zz.list:
>
> < zz.guess sed -e 's/s/800 /g' -e 's/l/1600 /g' | tr \  \\n | diff -u zz.list -
>
> I got exactly two wrong out of 83 (and the stats are about evenly
> balanced, 39 short files played and 44 long). So I think it's fair to
> say that, in the right context (an important caveat!), a time
> difference as short as (1602-802)/44.1=18.14+ milliseconds is clearly
> discernible to me.
>
> This is, of course, a situation designed to perceive a very small
> difference. I'm sure there are plenty of contexts in which I would
> fail to notice even 200ms of delay.
>
> /~\ The ASCII                     Mouse
> \ / Ribbon Campaign
>  X  Against HTML          mo...@rodents-montreal.org
> / \ Email!        7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
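[Editorial note: Mouse's helpers (count, code-to-char) aren't stock tools. As an illustrative sketch only - the function name and layout are made up, not from the thread - the same two-tick stereo pattern can be built in portable C: a second of silence, one full-scale single-sample tick, a gap of d+1 silent frames, a second tick, and another second of silence.]

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Build the two-tick 44.1kHz stereo pattern in memory.
 * Frames: 44101 silence, 1 tick, d+1 silence, 1 tick, 44101 silence.
 * Each frame is two int16 channels. Fills *out, returns frame count
 * (0 on allocation failure); caller frees *out. */
size_t make_ticks(int16_t **out, size_t d)
{
    size_t lead = 44101;                       /* "count from 0 to 44100" */
    size_t frames = lead + 1 + (d + 1) + 1 + lead;
    int16_t *buf = calloc(frames * 2, sizeof(int16_t));
    if (buf == NULL)
        return 0;
    /* Full-scale single-sample clicks on both channels. */
    buf[lead * 2] = buf[lead * 2 + 1] = INT16_MIN;          /* first tick */
    size_t second = lead + 1 + (d + 1);
    buf[second * 2] = buf[second * 2 + 1] = INT16_MIN;      /* second tick */
    *out = buf;
    return frames;
}
```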
Re: PSA: Clock drift and pkgin
On 2023-12-31 00:11, Michael van Elst wrote:
> On Sat, Dec 30, 2023 at 10:48:26PM +0100, Johnny Billquist wrote:
>> Right. But if you expect high precision on delays and scheduling,
>> then you start also having issues with just random unpredictable
>> delays because of other interrupts, paging, and whatnot. So in the
>> end, your high precision delays and scheduling become very imprecise
>> again. So, is there really that much value in that higher resolution?
>
> Better than 100Hz is possible and still precise. Something around
> 1000Hz is necessary for human interaction. Modern hardware could
> easily do 100kHz.

? If I remember right, anything less than 200ms is immediate response for a human brain. Which means you can get away with much coarser than even 100Hz. And there are certainly lots of examples of older computers with clocks running in the 10s of ms, where human interaction feels perfect.

> Another advantage is that you can use independent timing (that's what
> bites in the emulator case where guest and host clocks run at the same
> rate).

I think that is a separate question/problem/issue. That we fail when guest and host run at the same rate is something I consider a flaw in the system. It's technically perfectly possible to run such a combo well, and the fact that we didn't (don't) is just sad (in my opinion).

Not sure what you mean by independent timing here. For me, that would be if you had two different clock sources independent of each other.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-30 22:10, Michael van Elst wrote:
> b...@softjar.se (Johnny Billquist) writes:
>> Being able to measure time with high precision is desirable, but we
>> can already do that without being tickless.
>
> We cannot delay with high precision. You can increase HZ to some
> degree, but that comes at a price.

Right. But if you expect high precision on delays and scheduling, then you start also having issues with just random unpredictable delays because of other interrupts, paging, and whatnot. So in the end, your high precision delays and scheduling become very imprecise again. So, is there really that much value in that higher resolution?

But of course, this all becomes a question of tradeoffs, preferences and desires. Not sure if we need to have an argument about it. I don't know if anyone is working on a tickless design, or how far it has come. I will certainly not complain if someone does it. But I'm personally not feeling much of a lack that we don't have it.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-30 19:43, Martin Husemann wrote:
> On Sat, Dec 30, 2023 at 06:25:29PM +0000, Jonathan Stone wrote:
>> You can only do tickless if you can track how much time is elapsing
>> when no ticks fire, or none are pending. I don't see how to do that
>> without a high-res timer like a CPU cycle counter, or I/O bus cycle
>> counter, or what-have-you. Going fully tickless would therefore end
>> support for machines without such a timer. Is NetBSD ready to do
>> that?
>
> Kernels on those machines just would not run fully tickless.

Right. There is no reason to assume that all platforms would have to go tickless just because it becomes a possibility.

However, I'm also not sure how much value tickless adds here. The main reason I know of for tickless systems is power consumption. Not having to wake up just to count time can make a big difference. Sure, you can get higher precision for some scheduling with tickless, but I'm not sure it generally makes any actual significant difference. Being able to measure time with high precision is desirable, but we can already do that without being tickless.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-25 02:17, Robert Elz wrote:
>     Date:        Sun, 24 Dec 2023 13:49:53 +0100
>     From:        Johnny Billquist
>     Message-ID:
>
>   | In my opinion, all of these POSIX calls that take a time argument
>   | should really have been done the same as clock_gettime(), in that
>   | you specify what clock it should be based on.
>
> The next version of POSIX will contain pthread_cond_clockwait() which
> is just like pthread_cond_timedwait() but has a clock_id parameter.
>
>   | As it is now, it is (or should be according to POSIX)
>   | unconditionally CLOCK_REALTIME.
>
> Not sure about the current released standard, and too lazy to look ...
> but in the coming one that's not true either:
>
>     The pthread_cond_timedwait() function shall be equivalent to
>     pthread_cond_clockwait(), except that it lacks the clock_id
>     argument. The clock to measure abstime against shall instead come
>     from the condition variable's clock attribute which can be set by
>     pthread_condattr_setclock() prior to the condition variable's
>     creation. If no clock attribute has been set, the default shall
>     be CLOCK_REALTIME.

Happy to see that this is finally getting fixed. Working on embedded systems where you sometimes have no clue what the absolute time is, current POSIX was just close to unusable, and we've had to do some pretty horrible workarounds.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-24 20:58, Jonathan Stone wrote:
> On Sunday, December 24, 2023 at 02:43:55 AM PST, Johnny Billquist
> wrote:
>> Oh? So we are actually not POSIX compliant on that one? Interesting.
>> (POSIX explicitly says that the timeout should be for an absolute
>> time, which means that if you for example update the clock, moving it
>> backwards, the timeout should still only happen when that time
>> arrives, and not after some precomputed number of ticks.)
>
> One could keep track, for every timeout, whether it's relative or
> absolute; and when the time is changed, walk the list of as-yet-unfired
> timeouts, updating all the "absolute" timeouts by the clock-change
> delta.

One could, indeed. And then it would be compliant. (I'd dislike it, but that's a very personal opinion. :-) )

> Anyway .. I wonder if the "clock drift" is related to the clock drift
> I've heard about, on machines which don't have a hardware
> cycle-counter-style clock, and rely on clock-tick interrupts to track
> time (for example, pmax 2100/3100; decstation 5000/200; (most) vax).
> I'd really like to help out with clock drift, if I can do anything to
> help.

I am fairly sure all systems use the clock tick interrupt to track time in the end. No NetBSD port, as far as I know, is running a tickless implementation. But I think the suggestion that the time adjustment might actually be a source of the problem is interesting, and should be investigated. It just takes so bloody long to do a full build these days. I still haven't finished, and can't start chasing this quite yet.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-24 11:43, Johnny Billquist wrote:
> On 2023-12-24 09:26, Michael van Elst wrote:
>> sim...@netbsd.org (Simon Burge) writes:
>>> qemu uses ppoll() which is implemented with pollts() to do emulated
>>> timers, so that doesn't help here. I don't know what simh uses, nor
>>> any of the other emulators.
>>
>> simh uses pthread_cond_timedwait(). This actually waits using
>> TIMER_ABSTIME for a deadline, but which is converted to a timeout
>> with ts2timo() and passed to sleepq_block() as a number of ticks to
>> wait for.
>
> Oh? So we are actually not POSIX compliant on that one? Interesting.
> (POSIX explicitly says that the timeout should be for an absolute
> time, which means that if you for example update the clock, moving it
> backwards, the timeout should still only happen when that time
> arrives, and not after some precomputed number of ticks.)

By the way - I should point out that I am not advocating that we change this to be POSIX compliant. I do think POSIX is broken here.

In my opinion, all of these POSIX calls that take a time argument should really have been done the same as clock_gettime(), in that you specify what clock it should be based on. As it is now, it is (or should be according to POSIX) unconditionally CLOCK_REALTIME. But depending on what you are doing, CLOCK_MONOTONIC might be what you really wished you could use.

I think Linux has some private extension to the whole thing, which makes it possible to pick other clocks, but I've forgotten the details.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-24 09:26, Michael van Elst wrote:
> sim...@netbsd.org (Simon Burge) writes:
>> qemu uses ppoll() which is implemented with pollts() to do emulated
>> timers, so that doesn't help here. I don't know what simh uses, nor
>> any of the other emulators.
>
> simh uses pthread_cond_timedwait(). This actually waits using
> TIMER_ABSTIME for a deadline, but which is converted to a timeout with
> ts2timo() and passed to sleepq_block() as a number of ticks to wait
> for.

Oh? So we are actually not POSIX compliant on that one? Interesting. (POSIX explicitly says that the timeout should be for an absolute time, which means that if you for example update the clock, moving it backwards, the timeout should still only happen when that time arrives, and not after some precomputed number of ticks.)

  Johnny
Re: PSA: Clock drift and pkgin
By the way, I should point out that adding 1 tick to the reload of the interval timer in no way gets you away from the possibility that you'll get two timer signals with almost 0 time between them. The simple truth is that it is completely unknown when the program will actually get the signal for the first timeout. Basically, using a similar notation:

Time 0*T: timer_settime(.it_value = T, .it_interval = T), arms timer at 2*T
Time 2*T: timer expires, rearms timer at 4*T
Time 2*T + 1.9*T: process scheduled and signal delivered
Time 4*T: timer expires, rearms timer at 6*T
Time 4*T + 0.1*T: process scheduled and signal delivered

So even though we added one tick, you can still get two timer events in much closer proximity than a single tick as far as the process is concerned. The current code therefore does nothing to avoid the situation, but it does prevent running a timer at the same frequency as the system tick.

And we probably do need to talk about timer expiration and rearming as separate from signal delivery and process scheduling. Timer expiration and rearming happen fairly reliably, always close to the correct time. Signal delivery and scheduling, however, can be way off. And from a program point of view, that is what really matters in the end. If the program really wants a minimum amount of time before the next timeout, it needs to request the next time event at the processing point, not in something kernel-internal which happens very disconnected from the process.

  Johnny

On 2023-12-24 02:22, Johnny Billquist wrote:
> On 2023-12-23 23:05, Taylor R Campbell wrote:
>> The attached (untested) patch reverts to the old algorithm
>> specifically for the case of rearming a periodic timer, leaving the
>> new algorithm with +1 in place for all other uses.
>>
>> Now, it's unclear to me whether this is correct, because it can have
>> the following effect. Suppose ticks happen on intervals of time T.
>> Then, owing to unpredictable and uncontrollable scheduling delays,
>> the following sequence of events may happen:
>>
>> Time 0*T: timer_settime(.it_value = T, .it_interval = T), arms timer at 1*T
>> Time 1*T + 0.9*T: timer expires, rearms timer at 2*T
>> Time 2*T + 0.1*T: timer expires, rearms timer at 3*T
>>
>> The duration between these consecutive expirations of the timer is
>> 0.2*T, even though we asked for an interval of T. Of course, the
>> _average_ duration will be around T, but the _minimum_ duration is
>> essentially zero.
>>
>> POSIX clearly forbids a _one-shot_ timer, which is scheduled to
>> expire after time T, to actually expire after only time 0.2*T. But
>> the language in POSIX is unclear to me on whether this is allowed for
>> _periodic_ timers:
>
> I would argue that the first timeout should not happen with less than
> T time, so the rounding up of that one is correct. The rearming should
> be with T. The fact that the user level program might see the events
> with a 0.2*T interval (it could even be infinitely close to 0,
> actually) is an effect of how systems work. You can always have higher
> priority stuff going on, which delays the scheduling of a process for
> an unknown amount. Even in soft realtime systems, that is the case.
> Hard realtime is pretty tricky actually, and a lot of the time, people
> trying to do realtime don't understand the problems around this.
>
> Anyway, this, in the end, goes to the point of what the
> purpose/usecase/reason for the itimer is. I would like/prefer that it
> try to give that average interval. Because if not, then it is no
> different from the process just doing the call in the timer signal
> handler itself, and the interval parameter becomes just a convenience
> to not have to do the call yourself, and there is no way to get
> something that essentially gives a higher chance of getting the
> average rate you ask for. Which seems poorer than having the ability
> to have both the average, and a way of having a minimum interval.
>
> But when it comes to times and timeouts, POSIX has some serious flaws
> already, and POSIX is close to unusable for anything realtime related
> anyway. (Try to do a pthread_cond_timedwait() for a specific number of
> seconds, and consider a time change in between, and you'll see the
> problem.) So it's questionable whether POSIX is useful/helpful here
> anyway. But maybe (as Mouse suggested) we should add/have a second
> interface which actually provides what is useful, if POSIX insists on
> an always-minimum-interval interpretation.
>
>   Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-23 23:05, Taylor R Campbell wrote:
> The attached (untested) patch reverts to the old algorithm
> specifically for the case of rearming a periodic timer, leaving the
> new algorithm with +1 in place for all other uses.
>
> Now, it's unclear to me whether this is correct, because it can have
> the following effect. Suppose ticks happen on intervals of time T.
> Then, owing to unpredictable and uncontrollable scheduling delays, the
> following sequence of events may happen:
>
> Time 0*T: timer_settime(.it_value = T, .it_interval = T), arms timer at 1*T
> Time 1*T + 0.9*T: timer expires, rearms timer at 2*T
> Time 2*T + 0.1*T: timer expires, rearms timer at 3*T
>
> The duration between these consecutive expirations of the timer is
> 0.2*T, even though we asked for an interval of T. Of course, the
> _average_ duration will be around T, but the _minimum_ duration is
> essentially zero.
>
> POSIX clearly forbids a _one-shot_ timer, which is scheduled to expire
> after time T, to actually expire after only time 0.2*T. But the
> language in POSIX is unclear to me on whether this is allowed for
> _periodic_ timers:

I would argue that the first timeout should not happen with less than T time, so the rounding up of that one is correct. The rearming should be with T. The fact that the user level program might see the events with a 0.2*T interval (it could even be infinitely close to 0, actually) is an effect of how systems work. You can always have higher priority stuff going on, which delays the scheduling of a process for an unknown amount. Even in soft realtime systems, that is the case. Hard realtime is pretty tricky actually, and a lot of the time, people trying to do realtime don't understand the problems around this.

Anyway, this, in the end, goes to the point of what the purpose/usecase/reason for the itimer is. I would like/prefer that it try to give that average interval. Because if not, then it is no different from the process just doing the call in the timer signal handler itself, and the interval parameter becomes just a convenience to not have to do the call yourself, and there is no way to get something that essentially gives a higher chance of getting the average rate you ask for. Which seems poorer than having the ability to have both the average, and a way of having a minimum interval.

But when it comes to times and timeouts, POSIX has some serious flaws already, and POSIX is close to unusable for anything realtime related anyway. (Try to do a pthread_cond_timedwait() for a specific number of seconds, and consider a time change in between, and you'll see the problem.) So it's questionable whether POSIX is useful/helpful here anyway. But maybe (as Mouse suggested) we should add/have a second interface which actually provides what is useful, if POSIX insists on an always-minimum-interval interpretation.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-23 17:39, Taylor R Campbell wrote:
>> Date: Fri, 22 Dec 2023 23:41:47 -0500 (EST)
>> From: Mouse
>>
>> Specifically, under a kernel built with HZ=100, requesting signals at
>> 100Hz actually delivers them at 50Hz. This is behind the clock
>> running at half speed on my VAX emulator, and quite likely behind
>> similar behaviour from simh (which emulates VAXen, among other
>> things) on 9.3. I suspect it will happen on any port when requesting
>> signals one timer tick apart (ie, at HZ Hz).
>
> This is the well-known problem that we don't have timers with sub-tick
> resolution, PR kern/43997: https://gnats.netbsd.org/43997

You are somewhat missing the point. What Mouse is asking for is not at all about sub-tick resolution. What he asks for is exactly tick resolution.

And the important second half of this point is that an interval timer is automatically reset every time it has fired, which means it happens exactly at the tick, and then you want a new trigger at the next tick. And currently, that is not possible. Because the reset of the interval timer rounds one tick up to two ticks (it basically always adds one tick, no matter what interval you ask for, which implies that if you ask for 1s, it will actually fire after 101 ticks (if we have a 10ms tick)).

So nothing is sub-tick about this.

  Johnny
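[Editorial note: the rounding behaviour described above - round the interval up to whole ticks, then add one more - can be modelled in a few lines. This is an illustrative model of the effect being discussed, not the actual kernel code; it reproduces both numbers from the mail: a 10ms request at HZ=100 becomes 2 ticks (so "ask for 100Hz, get 50Hz"), and a 1s request becomes 101 ticks.]

```c
/* Model of the described tick conversion: round the requested interval
 * (in microseconds) up to whole ticks, then add one extra tick. */
long interval_to_ticks(long usec, long hz)
{
    long tick = 1000000L / hz;            /* tick length in microseconds */

    return (usec + tick - 1) / tick + 1;  /* round up, then +1 */
}
```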
Re: PSA: Clock drift and pkgin
On 2023-12-23 16:53, Mouse wrote:
>> [...], but we are in fact rounding it up to double the amount of time
>> between alarms/interrupts. Not what I think anyone would have
>> expected.
>
> Quite so. Whatever the internals behind it, the overall effect is "ask
> for 100Hz, get 50Hz", which - at least for me - violates POLA hard.

It is, when the system clock is running at 100Hz. If the system clock is running at 100Hz, you could expect it to be possible to have an interval timer that runs at 100Hz. But it seems at the moment we basically have it that interval timers can run at most at HZ/2.

  Johnny
Re: PSA: Clock drift and pkgin
On 2023-12-23 14:35, Mouse wrote:
>>> } else if (sec <= (LONG_MAX / 1000000))
>>>         ticks = (((sec * 1000000) + (unsigned long)usec
>>>             + (tick - 1)) / tick) + 1;
>>
>> The delay is always rounded up to the resolution of the clock, so
>> waiting for 1 microsecond waits at least 10ms.
>
> But it is increased by 1 tick when it is an exact multiple of the
> clock resolution, too. For sleeps, that makes some sense. For timer
> reloads, it doesn't.

I would probably agree that for timer reloads it should not do that rounding up when the interval is evenly divisible. It is a very different case from a sleep. For the reload we know that this happens at a specific time.

> I could of course be wrong about that code being responsible, but
> reading realtimerexpire() makes me think not; it uses tshzto(), which
> calls tstohz(), which calls tvtohz(), which is where the code quoted
> above comes from. Maybe realtimerexpire should be using other code?

Agreed.

>> Two options are to increase HZ on the host as suggested, or halve HZ
>> on the guest.
>
> I suppose actually fixing the bug isn't an option? I don't know
> whether that would mean using different code for timer reloads and
> sleeps or what. But 1.4T is demonstration-by-example that it is
> entirely possible to get this right, even in a tickful system. (I
> don't know whether 1.4T sleeps may be slightly too short; I haven't
> tested that. But, even if so, fixing that should not involve breaking
> timer reloads.)

A tickless system does not fundamentally change anything either. You cannot go below the resolution of a timer, and sleeps are supposed to sleep for *at least* the given time, but it could be more.

But in this case, we end up where the expected, and reasonable, behaviour would be to get alarms/interrupts at the specified frequency, because it is the resolution of the clock, but we are in fact rounding it up to double the amount of time between alarms/interrupts. Not what I think anyone would have expected.

  Johnny
Re: Unexpected out of memory kills when running parallel find instances over millions of files
I think I reported this something like 20 years ago, but no one really seemed to care. I noticed it pretty much right away after NetBSD switched to the unified memory thing, where all free memory usually gets grabbed as disk cache. It was not fun on VAX, but at the time it seemed other platforms didn't suffer enough to consider it a problem. I guess over time it's just gotten worse...

  Johnny

On 2023-10-21 13:01, Manuel Bouyer wrote:
> On Fri, Oct 20, 2023 at 10:26:05PM +0200, Reinoud Zandijk wrote:
>> Hi,
>>
>> On Thu, Oct 19, 2023 at 11:20:02AM +0200, Mateusz Guzik wrote:
>>> Running 20 find(1) instances, where each has a "private" tree with
>>> millions of files, runs into trouble with the kernel killing them
>>> (and others):
>>>
>>> [ 785.194378] UVM: pid 1998.1998 (find), uid 0 killed: out of swap
>>> [ 785.194378] UVM: pid 2010.2010 (find), uid 0 killed: out of swap
>>> [ 785.224675] UVM: pid 1771.1771 (top), uid 0 killed: out of swap
>>> [ 785.285291] UVM: pid 1960.1960 (zsh), uid 0 killed: out of swap
>>> [ 785.376172] UVM: pid 2013.2013 (find), uid 0 killed: out of swap
>>> [ 785.416572] UVM: pid 1760.1760 (find), uid 0 killed: out of swap
>>> [ 785.416572] UVM: pid 1683.1683 (tmux), uid 0 killed: out of swap
>>>
>>> This should not be happening -- there is tons of reusable RAM as
>>> virtually all of the vnodes getting here are immediately recyclable.
>>>
>>> $elsewhere I got a report of a workload with hundreds of millions of
>>> files which get walked in parallel -- a number high enough that it
>>> does not fit in RAM on boxes which run it. Out of curiosity I
>>> figured I'll check how others are doing on that front, but key is
>>> that this is not a made-up problem.
>>
>> I can second that. I have had UVM killing my X11 when visiting
>> millions of files; it might have been using rump but I am not sure.
>> What struck me was that swap was maxed out but systat showed
>> something like 40gb as `File'. I haven't looked at the Meta
>> percentage but it wouldn't surprise me if that was also high.
>> Just some random snippet:
>
> I've seen it too, although it didn't end up killing processes. But the
> nightly jobs (usual daily/security + backup) end up pushing lots of
> processes to swap, while the file cache grows to more than half the
> RAM (I have 16Gb). As a result the machine is really slow and none of
> the nightly jobs complete before morning. Decreasing kern.maxvnodes
> helps a lot.
Re: panic options
On 2023-09-12 23:55, Robert Swindells wrote:
> Is there any way to get panic(9) to behave differently in some places
> than others?
>
> There is a call to panic() if the kernel detects that there is no
> console device found; I would like to make this call to it just reboot
> without dropping into ddb.
>
> The amdgpu driver fails to initialize about 9 times in 10 for me, so I
> would like to reduce the amount of typing needed.

Well. Not completely what you ask for, but...

sysctl -w ddb.onpanic=0 (or even -1)

You could place that in /etc/sysctl.conf as well. See sysctl(7) for some more details.

  Johnny
Re: [PATCH] style(5): No struct typedefs
On 2023-07-14 11:56, Martin Husemann wrote:
> On Fri, Jul 14, 2023 at 01:00:26AM +0200, Johnny Billquist wrote:
>> Using "typedef struct bus_dma_tag *" instead of "typedef void *" will
>> do the same thing. That is no reason at all to skip the typedef.
>
> We want to avoid having to #include the header where that typedef
> lives. The typedef itself buys you nothing but a few characters less
> to type, but it introduces another layer of indirection where you
> could (accidentally) introduce inconsistencies (unless you include it
> from some common header, which brings us back to the start).

It also brings the benefit that if you want to change the definition, you change it in one place, and not 1000. But I see that your desire is to not have a common place for definitions of shared things. That is certainly a choice as well.

  Johnny
Re: [PATCH] style(5): No struct typedefs
On 2023-07-14 00:22, Taylor R Campbell wrote: Date: Tue, 11 Jul 2023 19:50:19 -0700 From: Jason Thorpe On Jul 11, 2023, at 2:56 PM, Taylor R Campbell wrote: I agree the keyword is ugly, and it's unfortunate that in order to omit it we would have to use C++, but the ugliness gives us practical benefits of better type-checking, reduced header file maintenance burden, and reduced motivation for unnecessary header file dependencies. No -- you just don't have to use "void *". Can you point to a practical problematic example? Using `struct bus_dma_tag *' instead of `void *' (whether via the bus_dma_tag_t alias or not) would provide better type-checking. Using "typedef struct bus_dma_tag *" instead of "typedef void *" will do the same thing. That is no reason at all to skip the typedef. And I totally agree that void * is usually something to be avoided, if possible. But I still fail to see what it has to do with the topic of typedef or not. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: [PATCH] style(5): No struct typedefs
On 2023-07-11 15:28, Mouse wrote: I don't get it. Why the "void *" stuff? That is where I think the real badness lies, and I agree we should not have that. But defining something like typedef struct bus_dma_tag *bus_dma_tag_t; would mean we could easily change what bus_dma_tag_t actually is, keeping it opaque, while at the same time keeping the type checking. Um, no, you get the type checking only as long as "what [it] actually is" is a tagged type - a struct, union, or (I think; I'd have to check) enum. Make it (for example) a char *, or an unsigned int, and you lose much of the typechecking. Maybe I missed your point. Yes, if you typedef something based on some simple type like int, then it's no different from any other int. typedefs in C don't really create new types. They are all just derivatives. Sometimes I even wonder why typedef exists in C. Feels like I could accomplish the same with a #define. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: [PATCH] style(5): No struct typedefs
On 2023-07-11 15:28, Mouse wrote: I don't get it. Why the "void *" stuff? That is where I think the real badness lies, and I agree we should not have that. But defining something like typedef struct bus_dma_tag *bus_dma_tag_t; would mean we could easily change what bus_dma_tag_t actually is, keeping it opaque, while at the same time keeping the type checking. Um, no, you get the type checking only as long as "what [it] actually is" is a tagged type - a struct, union, or (I think; I'd have to check) enum. Make it (for example) a char *, or an unsigned int, and you lose much of the typechecking. Sure. If you declare something as derived from char *, then anything that expects something char * would be happy with it. But if you say typedef struct bus_dma_tag *bus_dma_tag_t; then any function that expects this will not be happy with something that translates to char *. Basically, you get as much type checking as you could ever expect/require/demand. But I could agree with your point of not hiding the pointer in the typedef, and having it explicit as well, for some situations. But for something this opaque, I would actually think the pointer in there makes sense. Example:

---
typedef struct dma_bus_tag *dma_bus_tag_t;
typedef struct dma_bus2_tag *dma_bus2_tag_t;

int test(dma_bus_tag_t foo) { return 0; }

int foo(void)
{
	dma_bus2_tag_t x;
	return test(x);
}
---

This fails at compilation because x is not of the correct type for the function test(). If you change x to be of type dma_bus_tag_t, then the compilation is happy. Typechecking just as you would expect it, while being totally unaware of what this type actually looks like. Something that actually uses the value obviously needs to have the full definition of the structure, and dereference it, and so on. But all other code does not need any of that, and can be kept totally in the dark. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive!
|| tryin' to stay hip" - B. Idol
Re: [PATCH] style(5): No struct typedefs
On 2023-07-11 12:17, Taylor R Campbell wrote: I propose the attached change to KNF style(5) to advise against typedefs for structs or unions, or pointers to them. Passing around pointers to structs and unions doesn't require the definition, only a forward declaration: struct vnode; int frobnitz(struct vnode *); Agreed. But this is unrelated to typedefs. This can dramatically cut down on header file dependencies so that touching one header file doesn't cause the entire kernel to rebuild. This also makes it clearer whether you are passing around a copy of a struct or a pointer to a struct, which is often semantically important not to conceal. Typedefs are not necessary for opaque cookies to have labels, like `typedef void *bus_dma_tag_t;'. In fact, using void * here is bad because it prevents the C compiler from checking types; bus_dma_tag_t is indistinguishable from audio_dai_tag_t, and from pckbc_tag_t, and from acpipmtimer_t, and from sdmmc_chipset_handle_t. I don't get it. Why the "void *" stuff? That is where I think the real badness lies, and I agree we should not have that. But defining something like typedef struct bus_dma_tag *bus_dma_tag_t; would mean we could easily change what bus_dma_tag_t actually is, keeping it opaque, while at the same time keeping the type checking. Basically, getting all the benefits you mention from having it as a proper type, but still also keeping the ability to change what it actually is without any problems. So, yes to proper forward declarations. But I don't think typedefs as such is a problem. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Compilation failures in current
Clean out your object files and directories. Sometimes old stuff hanging around there can prevent a build from succeeding. Johnny On 2023-06-21 15:43, bsd...@tuta.io wrote: Jun 20, 2023, 23:52 by mar...@duskware.de: On Wed, Jun 21, 2023 at 05:29:08AM +0200, bsd...@tuta.io wrote: Since the past 3-4 days, I am getting consistent failures while compiling the kernel - current. Did you update the whole source tree? How are you compiling your kernel? Martin Yes, I updated the whole tree. I was just building the kernel manually, using make. But based on Rin Okuyama's comment, I am trying to build the kernel using build.sh, but building the tools fails for me. ./build.sh -U -j3 -m amd64 -a x86_64 tools --- includes --- cd: can't cd to include *** Failed target: includes *** Failed commands: @(cd include && find . -name '*.h' -print | while read f ; do ${HOST_INSTALL_FILE} $$f ${HOST_INCSDIR}/compat/$$f ; done) => @(cd include && find . -name '*.h' -print | while read f ; do /usr/src/tools/binstall/obj/xinstall -c -r $f /usr/src/obj/tooldir.NetBSD-10.99.4-amd64/include/compat/$f ; done) *** [includes] Error code 2 nbmake[2]: stopped in /usr/src/tools/compat 1 error nbmake[2]: stopped in /usr/src/tools/compat *** Failed target: install-compat *** Failed command: _makedirtarget() { dir="$1"; shift; target="$1"; shift; case "${dir}" in /*) this="${dir}/"; real="${dir}" ;; .) this=""; real="/usr/src/tools" ;; *) this="${dir}/"; real="/usr/src/tools/${dir}" ;; esac; show=${this:-.}; echo "${target} ===> ${show%/}${1:+ (with: $@)}"; cd "${real}" && /usr/src/obj/tooldir.NetBSD-10.99.4-amd64/bin/nbmake _THISDIR_="${this}" "$@" ${target}; }; _makedirtarget compat install *** Error code 2 -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: PROPOSAL: Split uiomove into uiopeek, uioskip
On 2023-05-10 14:00, Jason Thorpe wrote: On May 9, 2023, at 3:09 PM, Taylor R Campbell wrote: - uiopeek leaves uio itself untouched (it may modify the source/target buffers but it's idempotent). Hm… I’m having second thoughts about uiopeek(), as well. It implies a direction (“peek” feels like “read”, and “write” would feel more like a “poke”). I think uiocopy() is a better name, and I think it is sufficiently different from uiomove() (“move” implies a sortof destructive-ness that “copy” does not). I would sortof agree. But for me, "peek" more suggests that you are looking at the content, but intentionally leaving the data in there for something else to later pick up/process. "copy" seems to more appropriately describe what the intent is. Also, "skip" for me implies that you are skipping over the data, intentionally not interested in it. After a copy, I would feel it more descriptive to say "advance" rather than "skip" (or if someone else has another good verb for it, I'm all ears...) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Per-descriptor state
On 2023-05-08 02:09, Johnny Billquist wrote: On 2023-05-08 01:30, David Holland wrote: > On Fri, May 05, 2023 at 09:44:25PM -0400, Mouse wrote: > not nearly as > horrible as, say, the hack I once implemented where calling > wait4(0x456d756c,(int *)0x61746f72,0x4d616769,(struct rusage *)0x633a2d29) ^^ > would produce suitably magic effects. (This was on a 32-bit machine.) surely you mean 39! Hmm? "EmulatorMagic:-)" don't do it for you, you want "EmulatorMagic:=)" ? I wonder what the emoji " :=) " means... Double nose? Ok. I'm being very silly. I need to go to bed... :-) And I'm clearly half asleep already. Sorry... :-9 it was... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Per-descriptor state
On 2023-05-08 01:30, David Holland wrote: > On Fri, May 05, 2023 at 09:44:25PM -0400, Mouse wrote: > not nearly as > horrible as, say, the hack I once implemented where calling > wait4(0x456d756c,(int *)0x61746f72,0x4d616769,(struct rusage *)0x633a2d29) ^^ > would produce suitably magic effects. (This was on a 32-bit machine.) surely you mean 39! Hmm? "EmulatorMagic:-)" don't do it for you, you want "EmulatorMagic:=)" ? I wonder what the emoji " :=) " means... Double nose? Ok. I'm being very silly. I need to go to bed... :-) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: building 9.1 kernel with /usr/src elsewhere?
You should build the kernel using build.sh, with the tools and all from there. ./build.sh kernel=foobar Don't try to make things complicated by doing all that stuff by hand. :-) Johnny On 2023-03-08 00:32, Mouse wrote: Okay, I'm trying to help someone with a NetBSD 9.1 machine at work. Today's issue is one I had trying to build a kernel. We needed to do this because that project has a few customizations in the system needed to build the application layer, and more needed to run it. He installed stock 9.1, but did not install any source; /usr/src and /usr/xsrc did not exist. We then set up the customized source trees in his homedir, which I will here call /home/abcxyz, under a directory 9.1. Thus, the path to kernel source, for example, was /home/abcxyz/netbsd-9.1/usr/src/sys. Then I copied a kernel config (GEN91) into my ~/kconf/GEN91, from back when I was working on that project. I then ran % config -b ~/kbuild/GEN91 -s /home/abcxyz/netbsd-9.1/usr/src/sys ~/kconf/GEN91 This completed apparently normally, reporting the build directory and telling me to remember to make depend. I then went to ~/kbuild/GEN91 and ran make depend && make. It failed fast - no more than a second or two - with make[1]: don't know how to make absvdi2.c. Stop (full log below). I then moved /home/abcxyz/netbsd-9.1/usr/{src,xsrc} to /usr/{src,xsrc}, chowned them -R to 0, destroyed ~/kbuild/GEN91, and repeated, only this time I passed /usr/src/sys to config's -s flag. This time the kernel built fine (at least apparently - we haven't tried booting it yet, but I've built enough kernels to be confident there are no obvious red flags in the log; it certainly did not fail a second or two in with a cryptic message about absvdi2.c). Note in particular that the source tree content was identical; only the path and ownership differed. Is this a bug? Or am I doing something wrong? /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mo...@rodents-montreal.org / \ Email!
7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B The logfile includes one line that's over 3300 characters long, guaranteeing that it'd get mangled by email (soft limit 78 characters, hard limit 998). So, I ran the logfile through bzip2 and then base64-encoded the output. Here's the result: QlpoOTFBWSZTWfzLRGEAA6xfgGAQSZP/cj/v36qwUANjwoKAAGSNUbUn6mpv U1NPU8hAZB6mQA9EMymmagOaMmJgAmIwI0wIMRgmTAIw5oyYmACYjAjTAgxGCZMA jDmjJiYAJiMCNMCDEYJkwCMEUQkwmQJoBR6jR5QA02ppo0Gm2qaekaI6tlUVe/S1 HAv0T332gt7Q3uZ7SnDMNrMXxRm2qdT965avS81vYtWJqOVqFHMMG2lXjZAzLhEs 27kbS8RZOB1YXdRZGFfB+Vv64SXOnJbdALlWMEEIQABCjGnQhAOAkGYImXVAYFSV gDpIIgIk4iJU661xD4IOouYQv7DytQCYMTqXUIYqMGgCfbG40vx0e6+cYsRMxWbG Ry7dS5oRQAwI3XHxETDWFMrFa1+URPAsCmnoETwfSvQInSHI418MxAY/xIzvVCzA h8iP4i6OcEEPxhJRZKmkZ6dtYVdUDx1F0h7z+ryOBD/FDb5RQsf2pwBlyggpIeVR 92uNxc3aZYSUPe2TV3aWC5CNOUBELwKW0ESTM5thvQmeUmlz7thjtk0ZKaUlm7gU OASVllCpleWIQpx0y3morldhNMw1tTjtzvoRkc3rx23xrveRHQbDZTLTOChswIDD Hd1nzGHP6A6yTEnMgL84Z9SCSxgYH37YdFFZdV2mdEHoHkd4UEEKgyDwOZDxQfyI LY3SF3TJCfWg+wRPXIPqESyDXzK+OW6lP1KEJkIlSAsAbcBcD4Z3iJiVKGIiUIVP 2N8kaG0HbAXu37Pbdt2AG2OyBH02gG/nzEShoSSVebJPP+Tvi5LgcVHE0ETWo+og lPYdofIamwKb8sqHQKEUOcsc6XWAN4TwoSAdclLBN1QIjmJqi9pYEKRoAYHCIhB2 JbpbzqBS+ki6xaG47NWFzrJmUkSRYUZQK3CJKcZTX/htKF+MY7OgFIFyd+gvIvDY 9NSBvxpfE85ThkVMyOwRJdkiJYLaw0Bb1Qzg77rchEyOLkoB1ApmIlADnLcxULoE TYnnCmYiQ6/ao6pPnJ+4sPC3GA6D4mvuANDYUESE1XGG9JKlu+DVtUdYidGAibhE 4iJ/4u5IpwoSH5lojCA= -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: 9.99.100 fallout: bootloader and modules
Searching for uses of netbsd_version, there is some more broken logic in a few places, following similar patterns or assumptions. Like in /usr/src/sys/arch/i386/stand/lib/exec.c:

	if (netbsd_version / 1000000 % 100 == 99) {
		/* -current */
		snprintf(buf, bufsize, "/stand/%s/%d.%d.%d/modules",
		    machine,
		    netbsd_version / 100000000,
		    netbsd_version / 1000000 % 100,
		    netbsd_version / 100 % 100);
	} else if (netbsd_version != 0) {
		/* release */
		snprintf(buf, bufsize, "/stand/%s/%d.%d/modules",
		    machine,
		    netbsd_version / 100000000,
		    netbsd_version / 1000000 % 100);
	}

So just changing the modulo in the place you were suggesting isn't enough. :-P Johnny On 2022-09-21 13:47, Taylor R Campbell wrote: The x86 bootloader, and the MI efiboot, are unable to find modules when the kernel version is 9.99.100 -- they try /stand/$ARCH/9.99.0/modules instead, because of this logic:

	/* sys/arch/i386/stand/lib/exec.c */
	snprintf(buf, bufsize, "/stand/%s/%d.%d.%d/modules",
	    machine,
	    netbsd_version / 100000000,
	    netbsd_version / 1000000 % 100,
	    netbsd_version / 100 % 100);	/* XXX */

	/* sys/stand/efiboot/module.c */
	const u_int vmajor = netbsd_version / 100000000;
	const u_int vminor = netbsd_version / 1000000 % 100;
	const u_int vpatch = netbsd_version / 100 % 100;	/* XXX */

I will try the attached patch to do `% 10000' instead of `% 100' on the lines marked XXX. Likely other bootloaders will need to be adjusted to handle this. Loading modules from the bootloader in a >=9.99.100 kernel will require updating the bootloader. (After boot, module loading works fine because the kernel's module loader uses the `osrelease' string instead of doing arithmetic on __NetBSD_Version__.) -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
On 2022-06-15 18:41, Mouse wrote: By the way. This obviously does not at all solve the problem that the OP had. He was writing code with the expectation that malloc() should fail. [...] A killed process won't make the OP happy, even if it was his own program/process. I'm not sure that last sentence is true. As I read it, the reaction to malloc failing would have been that we've put as much pressure on as we can, so it's time to exit. If so, killing that process is a reasonable reaction - and that's what my proposed approaches were based on. Perhaps my understanding is wrong, in which case not much will help except no overcommit. I might have misunderstood the purpose. I thought there was a desire to stress things a bit once you get to the memory full state to see how things were behaving. So you might be absolutely right. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
By the way. This obviously does not at all solve the problem that the OP had. He was writing code with the expectation that malloc() should fail. For this, we need to get something that will not allow overcommitting memory, and where malloc() can then return an error instead of the process getting killed. A killed process won't make the OP happy, even if it was his own program/process. Johnny On 2022-06-15 17:41, Johnny Billquist wrote: On 2022-06-15 16:01, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: They might be the reason for the memory shortage. You can prefer large processes as victims or protect system services to keep the system manageable. So when one process tries to grow, you'd kill a process that currently has no issues running? All processes have issues on that system and the goal is to keep things alive so that you can recover; a system hang, crash or reboot is the worst outcome. Maybe, but not definitely. And the outcome is in general processes being killed, which basically should never result in an outright crash or reboot. Not even a hang, although if the wrong process is killed, you might end up not being able to access the system, so it's a bit of a grey area. Obviously there is no heuristic that can predict what action will have the best outcome and which causes the least damage. Guessing on the cost of various kinds of damage is an impossible task by itself, as that is fairly subjective. Agreed. But the one thing that is known at a specific point in time is that there is one process that needed one more page, which could not be satisfied. All the other processes at that moment in time are not in trouble. Which also means, we do not know if killing another process is enough to keep this process going, and we do not know if that other process would ever get into trouble at all. So we are faced with the choice of killing one process we know is in trouble, or speculatively killing something else, and then hoping that would help.
The suggestion that we'd add some kind of hinting could at least help some, but it is rather imperfect. And if we don't have any hints, we're back in the same place again. But there can be a heuristic that helps in many cases, and for the rest you can hint the system. If you can come up with some heuristics, it would be interesting to see them. I don't see any easy ones. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
On 2022-06-15 16:01, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: They might be the reason for the memory shortage. You can prefer large processes as victims or protect system services to keep the system manageable. So when one process tries to grow, you'd kill a process that currently has no issues running? All processes have issues on that system and the goal is to keep things alive so that you can recover; a system hang, crash or reboot is the worst outcome. Maybe, but not definitely. And the outcome is in general processes being killed, which basically should never result in an outright crash or reboot. Not even a hang, although if the wrong process is killed, you might end up not being able to access the system, so it's a bit of a grey area. Obviously there is no heuristic that can predict what action will have the best outcome and which causes the least damage. Guessing on the cost of various kinds of damage is an impossible task by itself, as that is fairly subjective. Agreed. But the one thing that is known at a specific point in time is that there is one process that needed one more page, which could not be satisfied. All the other processes at that moment in time are not in trouble. Which also means, we do not know if killing another process is enough to keep this process going, and we do not know if that other process would ever get into trouble at all. So we are faced with the choice of killing one process we know is in trouble, or speculatively killing something else, and then hoping that would help. The suggestion that we'd add some kind of hinting could at least help some, but it is rather imperfect. And if we don't have any hints, we're back in the same place again. But there can be a heuristic that helps in many cases, and for the rest you can hint the system. If you can come up with some heuristics, it would be interesting to see them. I don't see any easy ones.
Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
On 2022-06-15 11:09, David Brownlee wrote: On Wed, 15 Jun 2022 at 08:31, Johnny Billquist wrote: On 2022-06-15 06:57, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: I don't see any realistic way of doing anything with that. It's basically the first process that tries to allocate another page when there are no more. There are no other processes at that moment in time that have the problem, so why should any of them be considered? They might be the reason for the memory shortage. You can prefer large processes as victims or protect system services to keep the system manageable. So when one process tries to grow, you'd kill a process that currently has no issues running? Which means you might end up killing a lot of non-problematic processes because of one runaway process? Seems to me not to be a good decision. As opposed to the process which had a successful malloc some time ago and is running without issues, and is just about to try to use some of its existing allocation? That is speculation, which is my problem here. You are trading a known requester of non-existent memory for speculation that another process *might* want non-existent memory. Both options are wrong in some cases. Having a way to influence the order in which processes are chosen would seem to be the best way to end up with a better outcome. The existing behaviour should remain an option, but (at least for me) it would not be the one chosen. I (obviously) disagree. :-) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
On 2022-06-15 06:57, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: I don't see any realistic way of doing anything with that. It's basically the first process that tries to allocate another page when there are no more. There are no other processes at that moment in time that have the problem, so why should any of them be considered? They might be the reason for the memory shortage. You can prefer large processes as victims or protect system services to keep the system manageable. So when one process tries to grow, you'd kill a process that currently has no issues running? Which means you might end up killing a lot of non-problematic processes because of one runaway process? Seems to me not to be a good decision. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
On 2022-06-14 20:47, Mouse wrote: What might be interesting is a way to influence the order in which processes are chosen to kill... I don't see any realistic way of doing anything with that. It's basically the first process that tries to allocate another page when there are no more. There are no other processes at that moment in time that have the problem, so why should any of them be considered? To answer that, consider the original poster's situation: > I have a program that keeps malloc()ing (and scribbling a bit > into the allocated memory) until malloc() fails. The > intention is to put pressure on the VM system to find out how > much pool cache memory it can reclaim. Such a program would be a prime candidate for declaring itself a preferred out-of-swap victim. SunOS chill(1) - or was it chill(8)? - might be another example, though that's of minimal relevance to NetBSD. Well. First of all, such a program doesn't exist, as the malloc is not failing. It probably wouldn't be easy - the process which incurred the page fault would have to be put to sleep pending the death of the victim process - but it could provide for much better behaviour in situations like this. Second - are you proposing that you'd keep some kind of statistics on mallocs done in the past, in order to decide that this process should now be a candidate for a kill when you run out of pages? Are you sure it is a better candidate than the current process, which actually is demanding more pages at the moment? The other process might not in fact ever be hitting the condition, and would run on happily ever after, even though it did call malloc a number of times earlier. Even more, how do we decide that it's actually malloc, or is any memory demand treated equally? And yes, it also raises the question of how to handle the current process that caused a page demand that cannot be fulfilled at the moment.
Perhaps even better would be a way for userland to tell the kernel "pretend you're under severe RAM pressure and free what you can" without needing to actually run the system out of pages. Well, the process killing only happens when we are really out of pages, so no amount of "free what you can" helps (unless I'm confused). Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
On 2022-06-14 19:57, David Brownlee wrote: On Tue, 14 Jun 2022 at 13:33, Robert Elz wrote: NetBSD implements overcommitted swap - many processes malloc() (or mmap() which that really becomes in the current implementation) far more memory than they're ever going to actually use. It is only when some real physical memory is required (rather than simply a marker "zero filled page might be required here") that the system actually allocates any real resources. Similarly pages mapped from a file only need swap space if they're altered - otherwise the file serves as the backing store for it. Once upon a time there was a method to turn overcommitted swap off, and require actual allocations (of RAM or swap) to be made for all reserved (virtual) memory. I used to enable that all the time - but I haven't seen any mention of it in ages, and the mechanism might no longer still exist. What might be interesting is a way to influence the order in which processes are chosen to kill... I don't see any realistic way of doing anything with that. It's basically the first process that tries to allocate another page when there are no more. There are no other processes at that moment in time that have the problem, so why should any of them be considered? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
On 2022-06-14 12:59, Edgar Fuß wrote: So what should the kernel do? I don't know how things work under the hood today (I might have partially known in the times of sbrk()), but I would suppose that malloc() will ultimately result in some system call enlarging the heap/data segment/whatever. That system call could simply fail. I assume my impression is completely wrong (today). But then, how can a malloc() fail before the process gets killed? Process limits, for one. But I guess if your virtual memory becomes fragmented and you request a too big chunk, that would be another reason. But malloc today relies on the lazy memory grabbing of the pager. Until you actually reference the memory, it doesn't yet have to be backed by anything. (Unless I remember something wrong.) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: killed: out of swap
It's not the malloc that fails. It's the vm system trying to get a page for you. At which point it might not be your process that is trying to get a page when there are none free... So what should the kernel do? Johnny On 2022-06-14 12:01, Edgar Fuß wrote: I have a program that keeps malloc()ing (and scribbling a bit into the allocated memory) until malloc() fails. The intention is to put pressure on the VM system to find out how much pool cache memory it can reclaim. When I run that program (with swap space unconfigured), it doesn't terminate normally, but gets killed by the kernel with "out of swap". Unfortunately, other processes happening to malloc() during that time may get killed, too. I don't quite get what the rationale for that is (or maybe I'm doing something stupidely wrong). If I malloc(), and that fails, that should fail and not kill me, no? I'm surely missing something. -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Slightly off topic, question about git
Sorting things out before doing a commit in a centralized model means that your history as well as your code has a true linear history. And so you'll have to rewrite parts that you already committed in order to get things back to a coherent state. Merging two changesets that affect the same portions of the same files inevitably will require that in some cases. Yes. But there is a difference between sorting it out before or after the commit. This is a nasty problem when you have separate VCSs. Well, it becomes nasty because somewhere in the end, you still have a master VCS, which holds the source of truth. Distributed VCSs are not truly distributed. There is still just one master. Only if the humans involved insist on seeing it that way. There is no technical reason that has to be true. git lends itself very well to the "sure, fork it and see whose fork the userbase prefers" model. Is that a strength or a weakness? Each use case has to decide that for itself. We do tend to see it that way. NetBSD has one master. Sure, you can fork it if you want. But that is no longer NetBSD. Trying to regard this as a popularity-driven model is rather broken, I think. If the repo in question is used to produce a product with a single distribution channel, then there will inevitably be some kind of master in the sense of the one used to produce the distribution. But that's inevitable in that case; it's an artifact of the use case, nothing inherent to the underlying VCS. Sure. If you want a completely fragmented world view, then a distributed VCS makes perfect sense. I didn't expect you to take that position, but I'm not going to try to change your mind. But I'm not in that camp. :-) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Slightly off topic, question about git
On 2022-06-06 14:33, Mouse wrote: I've recently come to realize a thing with git I really abhor. It has a very loose view on history immutability. I've seen branches, which claim to come from some point, where the branch is way older than the revision it claims to have been branched off. Which obviously is impossible. But history rewriting seems to be a favorite pastime of git users. That's not a fault of git; that's a fault of how some people use git. Well, you could argue that it's a fault in git that it allows it. If there is a way, then some people will use it that way. For me, one of the really big points of VCS is that history is never changed. I can go back and see what was done, where, to what. And git can be used that way. No VCS is ever truly never-change, unless you use write-once media to store it, and even then it is always vulnerable to reconstructing a new repo from the ground up based on the old repo. Sure. You can change history in CVS as well. But you'll have to go in there and muck with the data that is behind it. It's not like the UI itself allows you to work that way. And I've not seen anything similar in a whole bunch of other VCSs I've worked with either. But I've generally not worked on distributed ones before. And I sortof can see why people want to go that way, since with distributed VCSs, it becomes much harder to have a linear history. But they still want to kindof/sortof fake it. Since git actually is multiple, independent VCSs, what happens on one doesn't necessarily come across to another at all, and in the process of aligning them, history has to be rewritten to even get close to making some kind of sense. Not really; history doesn't _have_ to be rewritten. That's what merge commits are for. People just choose to rebase work instead of merging. (Personally, I think that's a mistake, for various reasons, but, as you point out, not everyone agrees.) It sortof has to.
Since if you've done various work, and others have done various work on the same files, and both have done commits, it might not be possible to merge as is. And so you'll have to rewrite parts that you already committed in order to get things back to a coherent state. This is a nasty problem when you have separate VCSs. Well, it becomes nasty because somewhere in the end, you still have a master VCS, which holds the source of truth. Distributed VCSs are not truly distributed. There is still just one master. It's just about how you work in relation to it. I can see some advantages, but I'm still not sure if they outweigh the disadvantages that I feel. But that is of course very subjective. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Slightly off topic, question about git
On 2022-06-06 11:32, Greg Troxel wrote: David Brownlee writes: I suspect most of this also works with s/git/hg/ assuming NetBSD switches to a mercurial repo Indeed, all of this is not really about git. Systems in the class of "distributed VCS" have two important properties: commits are atomic across the repo, not per file True of most any VCS, distributed or not. It's rather CVS that is the odd man out here. But it's sortof a bit loose (or weird) in distributed VCSs, since you might do that to your local repo, but then it becomes "diluted" when it gets applied to another (upstream) repo. anyone can prepare commits, whether or not they are authorized to apply them to the repo. an authorized person can apply someone else's commit. Anyone can "prepare" a commit on any kind of VCS. It's the actual commit that is always the gate. The difference might more be in how you pass a commit over to someone else to apply. With CVS, I usually create a diff, and send that to someone who has the rights to apply it. Works just fine. (Since I don't have any commit rights on NetBSD for example.) These more or less lead to "local copy of the repo". And there are web tools for people who just want to look at something occasionally. But I find that it's not that big, that right now I have 3 copies (8, 9, current), and that it's nice to be able to do things offline (browse, diff, commit). Local copy is required with git, since everyone actually has their own VCS. And then you have some upstream VCS which you work 2-way with, in relation to your own VCS. Pros and cons, as always. CVS is really just RCS with organization into groups of files ability to operate over ssh (rsh originally :-) That was really great in 1994; I remember what a big advance it was (seriously). True. I've recently come to realize a thing with git I really abhor. It has a very loose view on history immutability.
I've seen branches, which claim to come from some point, where the branch is way older than the revision it claims to have been branched off. Which obviously is impossible. But history rewriting seems to be a favorite pastime of git users. For me, one of the really big points of VCS is that history is never changed. I can go back and see what was done, where, to what. Since git actually is multiple, independent VCSs, what happens on one doesn't necessarily come across to another at all, and in the process of aligning them, history has to be rewritten to even get close to making some kind of sense. I'm not at all convinced this is a good system. But that's just me. :-) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Hinted mmap(2) without MAP_FIXED can fail prematurely
On 2022-02-18 01:05, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: If it would ignore the hint, what's the point of the hint then? With MAP_FIXED it must use the hint, without it's just a best effort attempt. Which then basically means that without MAP_FIXED, the hint doesn't really mean anything? It will take whatever address it can come up with, no matter what you put into the hint. Which is what a hint of 0 should do. So I don't get that. With MAP_FIXED, the hint needs to be exactly on a page boundary, which makes sense. Without MAP_FIXED, and with a hint, I would expect that things like rounding the address to the proper alignment, and so on, would be allowed, but not that it would just take any address. If I'm ok with it taking any random address, then I shouldn't provide a hint. Anyway, that's just my reflections/thoughts on it. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Hinted mmap(2) without MAP_FIXED can fail prematurely
If it would ignore the hint, what's the point of the hint then? Johnny On 2022-02-17 08:23, PHO wrote: I'm not sure if this is a bug or not, but on NetBSD/amd64 9.2, hinted mmap(2) without MAP_FIXED can fail prematurely with ENOMEM if all the regions below the given hint address are occupied. The man page for mmap(2) suggests that the hint is just a hint. Shouldn't mmap(2) also search for available regions above the hint then? Test code: https://gist.github.com/depressed-pho/a629247b48b3e6178e35a14c62e9d44f#file-mmap-with-hint-c Test result: https://gist.github.com/depressed-pho/a629247b48b3e6178e35a14c62e9d44f#file-gistfile1-txt -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: wsvt25 backspace key should match terminfo definition
On 2021-11-23 18:59, Koning, Paul wrote: On Nov 23, 2021, at 12:53 PM, Valery Ushakov wrote: On Tue, Nov 23, 2021 at 09:22:43 -0500, Greg Troxel wrote: I think (memory is getting fuzzy) the problem is that the old terminals had a delete key, in the upper right, that users use to remove the previous character, and a BS key, upper left, that was actually a carriage control character. [... snip ...] I see the same kbs=^H on vt52. vt52 is different. I spent a lot of time on VT52, VT100, and VT2xx terminals. Me too. :-) All DEC terminals have a "delete" or "rub out" key, which transmits 0177. VT52 and VT100 also have a backspace key, which transmits 010. VT2xx (LK201 keyboard) do not have that key. Or rather, they do have a top row function key that is sometimes labeled BS but it sends an escape sequence, and is not taken seriously by most DEC programmers. It's a bit more convoluted than that. If you set a VT200 in VT100 mode, those three keys on the top row will in fact send LF, BS and ESC. Because if you wanted more VT100 compatibility, the VT100 had those keys, so the VT200 also had to have them if in VT100 mode. But yes, if you are in VT200 mode, they will in fact send various escape sequences. Not that it's really relevant here. The key to correct typing errors sends a DEL, which is what this is about. DEC software convention always was that the delete/rubout key is how you erase the previous character. Backspace was never used for that, and there wasn't any obvious reason to have it on the keyboard. It is, of course, an output formatting character. Some DEC software used BS, which is why the VT200 in VT100 mode still had it. One of the better-known pieces is FMS, which uses BS to step to the previous field, and TAB to step to the next field (there was no possibility of doing shift-TAB back then, which is popular nowadays for that same function).
Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: wsvt25 backspace key should match terminfo definition
On 2021-11-23 18:53, Valery Ushakov wrote: On Tue, Nov 23, 2021 at 09:22:43 -0500, Greg Troxel wrote: I think (memory is getting fuzzy) the problem is that the old terminals had a delete key, in the upper right, that users use to remove the previous character, and a BS key, upper left, that was actually a carriage control character. [... snip ...] I see the same kbs=^H on vt52. vt52 is different. I never used a real vt52 or a clone, but the manual at vt100.net gives the following picture: Not really different. The VT52 key that you use to delete characters to the left sends a DEL. https://vt100.net/docs/vt52-mm/figure3-1.html and the description https://vt100.net/docs/vt52-mm/chapter3.html#S3.1.2.3 Key / Code / Action Taken if Codes Are Echoed: BACK SPACE, 010, Backspace (Cursor Left) function; DELETE, 177, Nothing. Yes. And no one would ever be hitting the backspace with the intent of deleting what you just typed. This all originates with the ASR33 actually. There the key was labelled "RUB OUT". And it sends a DEL. vt100 had similar keyboard (again, never used a real one personally) https://vt100.net/docs/vt100-ug/chapter3.html#F3-2 BACKSPACE, 010, Backspace function; DELETE, 177, Ignored by the VT100. I've used both VT52 and VT100, as well as almost every other model of VT terminal there is. But vt200 and later use a different keyboard, lk201 (and i did use a real vt220 a lot) https://vt100.net/docs/vt220-rm/figure3-1.html that picture is not very good, the one from the vt320 manual is better https://vt100.net/docs/vt320-uu/chapter3.html vt220 does NOT have a configuration option that selects the code that the key sends. Correct. But somehow the official terminfo database has kbs=^H for vt220! Which is wrong. Later it became configurable: https://vt100.net/docs/vt320-uu/chapter4.html#S4.13 Yes. It might only have been configurable if you had specific keyboards, but at least it was possible to change for some. But default is still DEL.
For vt320 (where it *is* configurable) terminfo has $ infocmp -1 vt320 | grep kbs kbs=^?, Which is what I think it should be. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: wsvt25 backspace key should match terminfo definition
If something pretends to be a VT220, then the key that deletes characters to the left should send DEL, not BS... Just saying... Johnny On 2021-11-23 00:48, RVP wrote: The kernel currently defines the backspace key as: $ fgrep CERASE /usr/include/sys/ttydefaults.h #define CERASE 0177 $ This should probably be changed to CTRL('h') to match both the NetBSD and GNU terminfo backspace key definitions, otherwise the key doesn't work right after reset(1) or tset(1): $ infocmp wsvt25 | tr ',' '\n' | fgrep kbs # Reconstructed from /usr/share/misc/terminfo.cdb wsvt25|NetBSD wscons in 25 line DEC VT220 mode, [...] kbs=^H $ /opt/ncurses/bin/infocmp wsvt25 | tr ',' '\n' | fgrep kbs # Reconstructed via infocmp from file: /opt/ncurses/share/terminfo/w/wsvt25 wsvt25|NetBSD wscons in 25 line DEC VT220 mode, [...] kbs=^H $ Thanks, -RVP -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Devices.
On 2021-05-29 22:26, David Holland wrote: There are a number of infelicities in the way we currently handle the I/O plumbing for devices in the kernel. These include: [...] Just looking/thinking about the ioctl part - you say abolish it inside the kernel. So does that mean that we keep the ioctl() interface for user programs, but we'd create a system function that will translate ioctl() calls from user programs to different function calls based on class and function. Which means any time anyone wants to add some new kind of function to a device, you'd have to adjust the ioctl() system call, the class, all devices of that class, and of course the specific driver through which you want to do something with the hardware. Or did I misunderstand something here? I'm not trying to shoot anything down at this time. Just trying to understand the implications of the suggestion. I sortof like the idea of getting some generalization, which classes would bring. But the problem with ioctl causing duplication and all kinds of effort is at the same time also a strength, in that you can add something new to a device without having to care at all about all other devices and code. Since nothing else really tries to understand ioctls. Flexible, but more effort if you want coherent behavior. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On 2021-05-28 00:46, Joerg Sonnenberger wrote: On Fri, May 28, 2021 at 03:14:24AM +0700, Robert Elz wrote: Date: Thu, 27 May 2021 05:05:15 - (UTC) From: mlel...@serpens.de (Michael van Elst) Message-ID: | mlel...@serpens.de (Michael van Elst) writes: | | >Either direction mstohz or hztoms should better always round up to | >guarantee a minimal delay. | | And both should be replaced by hztous()/ustohz(). While changing ms to us is probably a good idea, when a change happens, the "hz" part should be changed too. hz is (a unit of) a measure of frequency, ms (or us) is (a unit of) a measure of time (duration) - converting one to the other makes no sense. "hz" in this context comes from HZ - it is used here as dimension of 1s/HZ. Just like "ms" here is used as "1s/1000". It's a pretty sensible naming compared to much longer naming variants. Well, Robert has a point. If you say hztoms(4), we are not asking for a conversion from 4 Hz to ms, but in fact four ticks at the current HZ, converted to ms. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On 2021-05-28 00:13, Robert Elz wrote: Date: Thu, 27 May 2021 20:19:06 + From: "Koning, Paul" Message-ID: <8765ae3a-b5b7-4b67-82ce-93473a5b9...@dell.com> | In this particular case it's converting frequency to period, | that is a sensible conversion. But it isn't, you can't convert 60 ticks/second into some number of milliseconds, the two are different units. Not that much. To quote wikipedia: "The dimension of the unit hertz is 1/time (1/T). Expressed in base SI units it is 1/second (1/s)." So basically, it's just the inverse of time. Based on that, it's pretty clear that conversion to/from time is very valid. And in another reply: Johnny Billquist said: | Frequency essentially means a counting of the number of times something | happens over a specific time period. With hertz, the time period is one | second. Sure. | So then converting the number of times an event | happens in a second into how long it is between two events makes total | sense. It would, but that's not what the functions do. What they do is tell how many ticks occur in a specific number of milliseconds (or vice versa). Your calculation is just (in milliseconds) 1000/hz, and assuming hz isn't varying, is a constant. Good point. It's a conversion from ms (or whatever time) to/from the number of cycles (or ticks if you want) based on the frequency given. I didn't think this through enough. While I certainly believe it's perfectly valid to convert between a frequency and a time for a single cycle, it does become weird to talk about frequency if we are in fact talking about some specific number of cycles, although those cycles can only be converted to a time if we have a frequency. I guess you convinced me. We should really call it something like tickstoms and mstoticks. But that does rely on people then understanding that it is the ticks defined by HZ, and not any random ticks. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive!
|| tryin' to stay hip" - B. Idol
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On 2021-05-27 22:50, Mouse wrote: A tick is not a duration. A tick is a specific event at a specific time. It has no duration. You have a duration between two ticks. At least as I use it and have heard it used, `tick' can also be used to refer to the interval between two of those events. "How long are timer ticks on this hardware?" or "For the next few ticks, we crunch on this...". I think that is mostly sloppy terminology where they actually want to know the time between two ticks. :-) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On 2021-05-27 22:14, Robert Elz wrote: Date: Thu, 27 May 2021 05:05:15 - (UTC) From: mlel...@serpens.de (Michael van Elst) Message-ID: | And both should be replaced by hztous()/ustohz(). While changing ms to us is probably a good idea, when a change happens, the "hz" part should be changed too. Not sure I agree with that. hz is (a unit of) a measure of frequency, ms (or us) is (a unit of) a measure of time (duration) - converting one to the other makes no sense. It sure does. Frequency essentially means a counting of the number of times something happens over a specific time period. With hertz, the time period is one second. So then converting the number of times an event happens in a second into how long it is between two events makes total sense. What these functions/macros do is convert between ms (or us) and ticks (another measure of a duration), not hz, so the misleading "hz" part of the name should be removed (changed) if a new macro/function is to be invented. (The benefit isn't worth it to justify changing the current existing names, but we shouldn't persist with nonsense if we're doing something new.) A tick is not a duration. A tick is a specific event at a specific time. It has no duration. You have a duration between two ticks. And commonly at every tick you get an interrupt, and you do something, and then resume your normal work. The tick has passed. Waiting for the next one to happen. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On 2021-05-26 12:59, Mouse wrote: I don't fully understand the discussion here. Initially people claimed that HZ at 8000 would be a problem, Well, my experience indicates that it _is_ a problem, at least when using disks at piixide (or pciide). Right. But not for the reason suggested. But obviously there is some kind of a problem somewhere. which for me seems a bit backwards. Me too. I was - am - rather puzzled by it. Right. That was my issue. Claiming that you'd get more problems with rounding and issues with coarseness when running at a higher frequency, when it in fact is the exact opposite. With a higher frequency you have more accuracy and less errors from truncations. Anyway, I have no idea what the actual problem is. Good luck with that part. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: 9.1: boot-time delay? [WORKAROUND FOUND]
On 2021-05-26 11:12, matthew green wrote: Manuel Bouyer writes: On Wed, May 26, 2021 at 05:35:53PM +1000, matthew green wrote: Manuel Bouyer writes: On Tue, May 25, 2021 at 10:46:04PM -, Michael van Elst wrote: bou...@antioche.eu.org (Manuel Bouyer) writes: Another issue could be mstohz() called with a delay too short; mstohz() will round it up to 1 tick. # define mstohz(ms) ((unsigned int)((ms + 0ul) * hz / 1000ul)) actually, this one is problematic as well. mstohz(1) here gives 0 for hz < 1000. "(1 + 0) * hz / 1000" for any 'hz' less than 1000 will give 0. I don't fully understand the discussion here. Initially people claimed that HZ at 8000 would be a problem, which for me seems a bit backwards. And this comment should make that even more obvious. With hz at 8000, you actually get some usable value for mstohz(1), while with low hz definitions, you do not. So why would a high frequency at a clock be a problem? Seems the person who claimed that must have gotten things a bit backwards. I can only see two real problems with a high clock frequency: 1. The overhead of just dealing with clock interrupts increases. 2. If there are things that just give time in ticks, then ticks become very short. And if the assumption is that just one tick is enough, such an assumption can fail. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Reparenting processes?
On 2020-12-08 14:36, Mouse wrote: I've been thinking about building a way to move a job between shells, in particular between one window, ssh session, whatever, and another. Yes, I've somehow missed the VMS virtual terminals, that can detach their job from the real terminal and be reattached by a login process, instead of creating a new job. That must have come in after I stopped using VMS (mid-'80s); I'm pretty sure the VMS I used didn't have it. It actually doesn't have anything to do with virtual terminals, but the mechanism did exist already then. Jobs could be detached and reattached again, including at login. But it was/is a little obscure in VMS. It is much more prominent, and easy to deal with in TOPS-20. (Now back to on topic... :-) ) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: wait(2) and SIGCHLD
On 2020-08-16 21:17, Mouse wrote: They don't vanish, they get reparented to init(8) which then wakes up and reaps them. That probably would work, approximately, Well, it does work, to at least a first approximation. but isn't what's supposed to happen when a child's parent is ignoring SIGCHLD - the child should skip zombie state, and simply be cleaned up. And how is "reparent to init" not an acceptable means of implementing that? Acceptable or not, it would seem to not match our own documentation. From the sigaction() man-page: SA_NOCLDWAIT If set, the system will not create a zombie when the child exits, but the child process will be automatically waited for. The same effect can be achieved by setting the signal handler for SIGCHLD to SIG_IGN. The difference would be detectable if init were sent a SIGSTOP (assuming that isn't one which would cause a system panic) I don't think it would panic, but I think that, if it really does stop init, it's a bug that it does so. I thought I'd seen some code that rendered init immune to SIGKILL and possibly SIGSTOP too (maybe by forcing them into init's blocked-signals set? I forget). But I can't seem to find it now. SIGSTOP is one of two signals that a process supposedly should not be able to intercept. Of course, init is special enough that normal rules might not apply... so it would stop reaping children (temporarily) - processes of the type in question should not be showing up as zombies. Right, they shouldn't be. But init shouldn't be stopped, either. Similarly, I think it should be impossible to ptrace init, and I have a fuzzy memory that it was on at least one system I tried it on. I'll be poking around a bit more. How special does one really want init to be? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: SIGCHLD and sigaction()
On 2020-08-16 12:49, Johnny Billquist wrote: On 2020-08-15 22:46, Mouse wrote: When I install a SIGCHLD handler via sigaction() using SA_SIGINFO, is it guaranteed that my handler is called (at least) once per death-of-a-child? "Maybe." It depends on how portable you want to be. Historically, "no": in some older systems, a second SIGCHLD delivered when there's already one pending delivery gets lost, same as any other signal. Then someone - POSIX? SVR4? I don't know - decided to invent a flavour of signal that's more like writes to a pipe: multiple of them can be pending at once. Some systems decided this was sane and implemented it. Personally, I don't like it; I think signals should be much like hardware interrupts in that a second instance happening before the first is serviced gets silently merged. While we're on this topic. Unix signals don't exactly work like hardware interrupts anyhow, I suspect, and it's a thing that has constantly befuddled me. As far as I can tell, there is a problematic race condition in the old signal mechanism, and that is the reason, I believe, why the new semantics were introduced. The problem goes like this: You have two child processes. One exits, and you get into your signal handler. In there you then call wait to reap the child and process things. You then call wait again, and repeat until there are no children left to reap, as you only get one signal, even if you have multiple children that exit. When no more unreaped children exist, you exit the signal handler, and a new signal can be delivered. However, what happens if the second child exits between the call to wait, and the exit from the signal handler? It would seem the signal would get lost, since we are in the process of handling the signal, and a new signal is not delivered during this time. In real hardware this usually doesn't happen, because the actual interrupt request can be reissued by the device while you are in the interrupt handler.
There are some hardware interrupt designs, with edge-triggered interrupts, where similar problems can exist, and those you have to be very careful with how you handle them so you don't get into the same kind of race condition. Now, have I misunderstood something about how non-queued signal handling works, or is/was there a problem there? Reading the current documentation, I would assume that at the call to the signal handler, the signal is blocked, and also removed from pending signals, so a new event would queue up a new signal to be delivered when returning from the signal handler. However, the text above is from trying to recall how it used to be going back in time, to when you had to re-install the signal handler after each activation. I can't seem to find documentation for how it worked back in the day. I can't even remember when/where I was reading up on that and thinking there might be a problem here, but it was a long time ago. So this is possibly just of historical interest. By the way, I haven't seen any explicit mention of the pending signal being cleared at signal handler entry, so that is just my assumption right now. If that is wrong, then I would expect there is a race condition in there. Maybe someone else knows where that detail is documented? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: SIGCHLD and sigaction()
On 2020-08-15 22:46, Mouse wrote: When I install a SIGCHLD handler via sigaction() using SA_SIGINFO, is it guaranteed that my handler is called (at least) once per death-of-a-child? "Maybe." It depends on how portable you want to be. Historically, "no": in some older systems, a second SIGCHLD delivered when there's already one pending delivery gets lost, same as any other signal. Then someone - POSIX? SVR4? I don't know - decided to invent a flavour of signal that's more like writes to a pipe: multiple of them can be pending at once. Some systems decided this was sane and implemented it. Personally, I don't like it; I think signals should be much like hardware interrupts in that a second instance happening before the first is serviced gets silently merged. While we're on this topic. Unix signals don't exactly work like hardware interrupts anyhow, I suspect, and it's a thing that has constantly befuddled me. As far as I can tell, there is a problematic race condition in the old signal mechanism, and that is the reason, I believe, why the new semantics were introduced. The problem goes like this: You have two child processes. One exits, and you get into your signal handler. In there you then call wait to reap the child and process things. You then call wait again, and repeat until there are no children left to reap, as you only get one signal, even if you have multiple children that exit. When no more unreaped children exist, you exit the signal handler, and a new signal can be delivered. However, what happens if the second child exits between the call to wait, and the exit from the signal handler? It would seem the signal would get lost, since we are in the process of handling the signal, and a new signal is not delivered during this time. In real hardware this usually doesn't happen, because the actual interrupt request can be reissued by the device while you are in the interrupt handler.
There are some hardware interrupt designs, with edge-triggered interrupts, where similar problems can exist, and those you have to be very careful with how you handle them so you don't get into the same kind of race condition. Now, have I misunderstood something about how non-queued signal handling works, or is/was there a problem there? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: wait(2) and SIGCHLD
I agree that this is confusing. While the system definitely differentiates between SIG_DFL and SIG_IGN, the difference would normally not be something I expected to make a difference in something described the way wait(2) is documented. I haven't really bothered going down into the code to find the answer, but I'm curious what other answers pop up for this one. Johnny On 2020-08-14 13:51, Edgar Fuß wrote: I'm confused regarding the behaviour of wait(2) wrt. SIGCHLD handling. The wait(2) manpage says: wait() will fail and return immediately if: [ECHILD]The calling process has no existing unwaited-for child processes; or no status from the terminated child process is available because the calling process has asked the system to discard such status by ignoring the signal SIGCHLD or setting the flag SA_NOCLDWAIT for that signal. However, ignore is the default handler for SIGCHLD. So does the because the calling process has asked the system to discard such status by ignoring the signal SIGCHLD mean that explicitly ignoring SIGCHLD is different from ignoring it per default? -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: style change: explicitly permit braces for single statements
On 2020-07-13 17:52, Paul Goyette wrote:

    if (ch == t.c_cc[VINTR]) {
        ...do INTR processing...
    } else {
        if (ch == t.c_cc[VQUIT]) {
            ...do QUIT processing...
        } else {
            if (ch == t.c_cc[VKILL]) {
                ...do KILL processing...
            } else {
                ...etc
            }
        }
    }

For this, I would prefer

    if (ch == t.c_cc[VINTR]) {
        ...do INTR processing...
    } else if (ch == t.c_cc[VQUIT]) {
        ...do QUIT processing...
    } else if (ch == t.c_cc[VKILL]) {
        ...do KILL processing...
    } else {
        ...etc
    }

I would agree. That is a more readable way, I think. In this case, perhaps even a switch might be better, assuming that all of the t.c_cc[] are unique:

    switch (ch) {
    case t.c_cc[VINTR]) {
        ...do INTR processing...
        break;
    };
    case t.c_cc[VQUIT]) {
        ...do QUIT processing...
        break;
    }
    case t.c_cc[VKILL]) {
        ...do KILL processing...
        break;
    }
    ...etc

In which language would this be? It's not C, at least... The syntax is slightly broken, but more importantly, case values in C must be integer constant expressions... But I agree it would be nice if C allowed more flexible options for the case values... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-15 15:39, Kamil Rytarowski wrote: On 15.06.2020 15:21, Johnny Billquist wrote: Anyway. Who here does not modify their path at login anyway. The path has to be readily available for pkgsrc users with an unprepared environment. However, if we install the utility into /usr/sys (similar to /usr/games), we can use a full path to the program and it will be good enough (for me). Are there other programs that would be moved to this directory? Using explicit paths is sometimes a good idea no matter what. Obviously I think something like config should be moved out. I would tend to look at /sbin as the place for tools required to manage the system, or to get information that "normal" users commonly would not care about. /bin would be where the programs that a large portion of ignorant users need could be found. Even things like compilers would make sense to have there, as would things like passwd. However, ifconfig, for example, I would keep out. It can be used even by normal users, in order to look at interface details, but it's outside of what I think a naive user would go for. Same for arp. However, things needed to build the kernel and userland are neither normal user tools nor system administration tools in the normal sense, so neither /bin nor /sbin really feels like the right place. /sys would make more sense, but I'm not totally clear on whether there are aspects I haven't thought about yet. (And I'm currently saying /bin, /sbin and so on, without adding the corresponding /usr/... paths. Traditionally, the things under root were things you expected to need to be able to do without even having file systems mounted and so on, and I like that distinction. So I guess most things related to system building would only need to be in /usr/sys, if that was the path we'd go for.) I have got a feeling that too many programs already rely on specific kernel internals so making a distinction would only confuse people and impose unclear conditions on what belongs where.
fsdb(8) or crash(8) are definitely not going to be very usable with mixed kernel and userland versions. Something we possibly agree upon is that makesyscalls(1) would not be a tool for administering a computer/server, so /usr/sbin or /sbin is not a good place. I agree that /sbin also does not feel natural for makesyscalls. If I had to choose between /bin or /sbin, I would however go for /sbin. I have less of an issue creating kind of weird views for a system admin, whom I expect to know a little more about what he is doing and why, than for an ignorant user, for whom I think a cohesive and meaningful environment is the most important thing. As such, something like this under /bin would just be a very confusing and not useful tool for that user. It all boils down to what the purpose of the different directories is. And I cannot agree that things under /sbin should somehow only be tools for root, or require that you are root. It is in a different directory than /bin so that it doesn't have to be in the path of various people. If we are not interested in that path separation/distinction then we do not need the separate directories at all. /sbin as the place where mainly system admin tools sit does make sense. It's even hinted at by the name, if you read sbin as "system binaries". But there are just some tools that are very specific to kernel development, and that is really a kind of binary for which I don't think we have a good place right now. /libexec is another interesting place, but I consider that to be a place where binaries that are invoked by other binaries are located. And your description of your planned use of makesyscalls makes it sound like you are planning to use it directly, and not just have it as a binary invoked by other tools... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
Sorry for the top-posted, terse reply. On my phone, as I'm in a meeting at work right now. Anyway. Who here does not modify their path at login anyway? So arguments like "I need it in bin because I am going to use it a lot" seem like weak arguments. Think of what makes sense for people who don't know much. For yourself, whatever the location, you should not have a hard time getting it set up to work nicely for you. No matter where the binary ends up. Really. Johnny

Kamil Rytarowski wrote (15 June 2020 14:30:49 CEST):
>On 15.06.2020 14:16, Johnny Billquist wrote:
>> On 2020-06-15 14:12, Kamil Rytarowski wrote:
>>> On 15.06.2020 14:11, Johnny Billquist wrote:
>>>>
>>>> We should not clutter the directories that are in the normal users path
>>>> with things that a normal user would never care about.
>>>
>>> I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin, but
>>> I definitely would use makesyscall(1). If you have other argument that
>>> "I don't use it" please speak up.
>>
>> I'm not convinced you are particularly representative of "users".
>>
>
>NetBSD is my daily driver so I'm a user!
>
>> But it would be interesting to hear how and when you are planning to use
>> makesyscalls.
>>
>
>I work with the syscall layer almost continuously in various projects
>(debuggers, fuzzers, syscall tracers, sanitizers, non-libc language
>runtimes etc). Reiterating over the same list 10 times just increases
>the frustration and perception of lost time of repeating the same
>process in an incompatible way for another program. The tool shall
>centralize the whole knowledge about passed arguments, structs and
>export it to users through a flexible code generation.
>
>We already distribute to users /usr/include/sys/syscalls.h (and it is
>used e.g. by GDB to parse the syscalls, as parsing syscalls.master in
>that case was harder). makesyscalls(1) is intended to be a more
>specialized and generic version of the same functionality as
>distributed by this header.
>
>With some sort of fanciness, we could generate these lists on the fly in
>some projects (for e.g. GDB) and we would want the utility to be
>available in place. If it is restricted to the build-only phase of various
>programs (that definitely shall be free from BSDSRCDIR dependency) it
>will be good enough.
>
>I'm for adding this program in PATH and I would be a user on a regular
>basis. I basically need it for pretty much everything (2 ongoing GSoC
>projects are about covering the same syscalls in 2 different ways).
>Asking me for a use-case is odd to me as it is an elementary program
>that belongs to /usr/bin.
>
>> Johnny
>>
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: makesyscalls (moving forward)
On 2020-06-15 14:16, Johnny Billquist wrote: On 2020-06-15 14:12, Kamil Rytarowski wrote: On 15.06.2020 14:11, Johnny Billquist wrote: We should not clutter the directories that are in the normal users path with things that a normal user would never care about. I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin. but I definitely would use makesyscall(1). If you have other argument that "I don't use it" please speak up. I'm not convinced you are particularly representative of "users". Sorry, I left out "normal" in there. :-) It should have said "normal users". And I agree, there are things in /bin or /usr/bin that I never use either. No two persons are the same, and there has to be a line drawn somewhere. But at the moment, that line seems to be drawn by some in a way that to me feels totally crazy. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-15 14:12, Kamil Rytarowski wrote: On 15.06.2020 14:11, Johnny Billquist wrote: We should not clutter the directories that are in the normal users path with things that a normal user would never care about. I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin. but I definitely would use makesyscall(1). If you have other argument that "I don't use it" please speak up. I'm not convinced you are particularly representative of "users". But it would be interesting to hear how and when you are planning to use makesyscalls. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-15 14:08, Reinoud Zandijk wrote: On Mon, Jun 15, 2020 at 01:44:19PM +0200, Reinoud Zandijk wrote: As for config(1), I never understood why it is installed in /usr/bin and is called with such a generic name, but I guess that's historical. It's not that historical. And I really think it's totally wrong that it is in /usr/bin. In 2.11BSD, config is actually located in the conf directory. (But it's also just a shell script, so a bit simpler than in NetBSD.) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-15 12:25, Kamil Rytarowski wrote: On 15.06.2020 00:57, Johnny Billquist wrote: On 2020-06-15 00:52, Kamil Rytarowski wrote: On 15.06.2020 00:26, Johnny Billquist wrote: But that's just me. I'll leave the deciding to you guys... This is only me, but /sbin and /usr/sbin are for users with root privileges, while /bin and /usr/bin for everybody. makesyscalls(1) intends to be an end-user program that aids building software and this is just another specialized program similar to flex(1) or yacc(1), just a more domain specific code generator. Is ping only for people with root privileges??? ping needs setuid so yes. What kind of silly argument is that? I don't at all understand how people can think, first of all, that stuff under /sbin is for root only. Second, setuid exists for exactly the reason that non-root people should be able to run some things that require root privileges. It seems we are starting to make the distinction of where programs should go based on whether you need to be root to run them or not, which to me is a totally crazy idea. Should passwd then move to /sbin? Just look around; there are plenty of programs under /bin and /usr/bin which are setuid root. They should *not* move into /sbin because of that. /sbin and /usr/sbin hold tools that are generally not used by anyone other than system administrators. Tools that are often needed to bring up a system at boot time. Tools that commonly are not in normal users' paths. We should not clutter the directories that are in the normal user's path with things that a normal user would never care about. If we basically place any kind of tool in /bin or /sbin just because it doesn't require people to be root, then you will have all kinds of stuff there that is totally meaningless for most users, while a lot of meaningful things will be in /sbin and /usr/sbin, at which point everyone will need to have all the directories in their paths, at which point there is no point in even having a separate /bin and /sbin.
There has to be a reason to split binaries into different directories. A reason that has some practical meaning. Otherwise it's not actually useful, and instead becomes just some obscure inner-circle "thing" of obscurity. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-15 00:07, Kamil Rytarowski wrote: On 14.06.2020 23:59, Johnny Billquist wrote: On 2020-06-14 23:21, Paul Goyette wrote: On Sun, 14 Jun 2020, David Holland wrote: This raises two points that need to be bikeshedded: (1) What's the new tool called, and where does it live in the tree? "usr.bin/makesyscalls" is fine with me but ymmv. "usr.bin/makesyscalls" sounds good to me. Uh? usr.bin is where stuff for /usr/bin is located, right? Anything there should be pretty normal tools that any user might be interested in. Doesn't seem to me that makesyscalls would be a tool like that? Possibly some sbin thing, but in all honesty, wouldn't this make more sense to have somewhere under sys? Don't we have some other tools and bits which are specific for kernel and library building? /usr/bin is appropriate and there are already similar tools (like ioctlprint(1)). It's already in PATH and definitely in the interest of some end-users (like me) and I do want to have it. It could certainly be questioned whether ioctlprint should be in /usr/bin as well. If tools like ping, which arguably a lot of people have heard of and actually use, are in /sbin, what makes ioctlprint so much more commonly sought out and used a tool? I would say this is really a tool for pretty advanced users. But even so, I could more easily see ioctlprint in /usr/bin than I could makesyscalls. Anyone writing code that sits in the kernel is definitely in the category of users who have privileges and abilities that don't apply to normal users. And the question was not about having it or not, but where it should be located. Looking at hier(7), we have:

    /sbin/      System programs and administration utilities used in both
                single-user and multi-user environments.
    /usr/sbin/  System daemons and system utilities (normally executed by
                the super-user).

and

    /bin/       Utilities used in both single and multi-user environments.
    /usr/bin/   Common utilities, programming tools, and applications.
How "common" would you say makesyscalls is (or ioctlprint, for that matter)? I don't mind the name, but I also agree that this is mostly something for build.sh, and I am wondering if it wouldn't more appropriately fit somewhere under sys. Definitely not something I would expect a normal user to ever make use of. But that's just me. I'll leave the deciding to you guys... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-15 00:50, Robert Swindells wrote: Johnny Billquist wrote: On 2020-06-14 23:21, Paul Goyette wrote: On Sun, 14 Jun 2020, David Holland wrote: This raises two points that need to be bikeshedded: (1) What's the new tool called, and where does it live in the tree? "usr.bin/makesyscalls" is fine with me but ymmv. "usr.bin/makesyscalls" sounds good to me. Uh? usr.bin is where stuff for /usr/bin is located, right? Anything there should be pretty normal tools that any user might be interested in. Doesn't seem to me that makesyscalls would be a tool like that? As config(1) is in /usr/bin that seems the best place for makesyscalls too. Ouch! What a rabbit hole! I should be quiet now. :-) (I don't really think config makes any sense at all to have in /usr/bin... ;-) ) I would expect that the generated files would have the developer uid and gid, I wouldn't want them owned by root. I guess I fail to see the problem there. That all depends on who is running it, and where the files are placed. It doesn't really matter where the tool is located. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-15 00:52, Kamil Rytarowski wrote: On 15.06.2020 00:26, Johnny Billquist wrote: But that's just me. I'll leave the deciding to you guys... This is only me, but /sbin and /usr/sbin are for users with root privileges, while /bin and /usr/bin for everybody. makesyscalls(1) intends to be an end-user program that aids building software and this is just another specialized program similar to flex(1) or yacc(1), just a more domain specific code generator. Is ping only for people with root privileges??? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: makesyscalls (moving forward)
On 2020-06-14 23:21, Paul Goyette wrote: On Sun, 14 Jun 2020, David Holland wrote: This raises two points that need to be bikeshedded: (1) What's the new tool called, and where does it live in the tree? "usr.bin/makesyscalls" is fine with me but ymmv. "usr.bin/makesyscalls" sounds good to me. Uh? usr.bin is where stuff for /usr/bin is located, right? Anything there should be pretty normal tools that any user might be interested in. Doesn't seem to me that makesyscalls would be a tool like that? Possibly some sbin thing, but in all honesty, wouldn't this make more sense to have somewhere under sys? Don't we have some other tools and bits which are specific for kernel and library building? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: NULL pointer arithmetic issues
On 2020-02-25 12:33, Tom Ivar Helbekkmo wrote: Johnny Billquist writes: But yes, on the PDP11 [having nothing mapped at address 0] was/is not the case. Memory space is too precious to allow some of it to be wasted for this... Yup - and I assume the "hack" Kamil alludes to is the practice of actually starting the data segment for split I/D programs at address 1 instead of 0, to make sure that no actual pointer is 0, thus allowing the straightforward comparison of a pointer with 0 to see if it's set. Well, the D-space doesn't start at 1, and also, the PDP-11 isn't that fond of odd addresses. :-) Actually, you could not even start a page at address 1 if you wanted to. (I believe they also initialized address 0 to 0, to stop indirect references through it from reaching random data. I guess Franz may have depended on this in some way, e.g. expecting to be able to test *p directly, instead of first p and then *p. Do enough of this, and you've soon bummed a significant amount of valuable code space...) It used to, but not for some time now. Here is the current "state":

    /*
     * Paragraph below retained for historical purposes.
     *
     * The following zero has a number of purposes - it serves as a null terminated
     * string for uninitialized string pointers on separate I machines for
     * instance. But we never would have put it here for that reason; programs
     * which use uninitialized pointer *should* die. The real reason it's here is
     * so you can declare "char blah[] = "foobar" at the start of a C program
     * and not have printf generate "(null)" when you try to print it because
     * blah is at address zero on separate I machines ... sick, sick, sick ...
     *
     * In porting bits and pieces of the 4.4-Lite C library the global program
     * name location '___progname' was needed. Rather than take up another two
     * bytes of D space the 0th location was used. The '(null)' string was
     * removed from doprnt.s so now when programs use uninitialized pointers
     * they will be rewarded with argv[0]. This is no sicker than before and
     * may cause bad programs to die sooner.
     */
            .data
            .globl  ___progname, _strrchr
    ___progname: 0

Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: NULL pointer arithmetic issues
On 2020-02-25 02:12, Mouse wrote: Oh. And I actually do not believe it has to be a constant. You are correct; it does not need to be a simple constant. The text says "integer constant expression with the value 0, or such an expression..." Yes. (void *)(1-1) is a valid null pointer constant. So, on an all-ASCII system, is (0x1f+(3*5)-'.'). But, in the presence of int one(void) { return(1); } then (one()-one()) is not - it is an integer expression with value zero, but it is not an integer _constant_ expression. It's entirely possible that (int *)(one()-one()) will produce a different pointer from (int *)(1-1) - the latter is a null pointer; the former might or might not be, depending on the implementation. As you say, it's an integer expression. And I read that "or" part as just an expression, which this is. So I believe it is a valid way of creating something that can be converted to a NULL pointer. Also: if (expression) statement; shall execute statement if expression does not equal 0, according to the standard. So, where does that leave this code:

    char *p;
    ...
    if (p)
        foo();

p is not an integer. How do you compare it to 0? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: NULL pointer arithmetic issues
On 2020-02-24 21:24, Kamil Rytarowski wrote: On 24.02.2020 21:18, Mouse wrote: If we use 0x0, it can be a valid pointer. If we use NULL, it's not expected to work and will eventually generate a syntax error. Then someone has severely broken compatability with older versions of C. 0x0 and (when one of the suitable #includes has been done) NULL have both historically been perfectly good null pointer constants. Also...syntax error? Really? _Syntax_ error?? I'd really like to see what they've done to the grammar to lead to that; I'm having trouble imagining how that would be done. The process of evaluation of the NULL semantics is not a recent thing. Not so long ago, still in the NetBSD times, it was a general practice to allow dereferencing the NULL pointer and expect zeroed bytes over there. We still maintain compatibility with this behavior (originated as a hack in PDP11) in older NetBSD releases (NetBSD-0.9 Franz Lisp binaries depend on this). Really? I thought we usually do not have anything mapped at address 0, to explicitly catch any dereferencing of NULL pointers. But yes, on the PDP11 this was/is not the case. Memory space is too precious to allow some of it to be wasted for this... (Even if there is a comment about it in 2.11BSD, bemoaning this fact...) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: NULL pointer arithmetic issues
On 2020-02-25 00:24, Johnny Billquist wrote: On 2020-02-24 23:35, Mouse wrote: Unless I remember wrong, older C standards explicitly say that the integer 0 can be converted to a pointer, and that will be the NULL pointer, and a NULL pointer cast as an integer shall give the value 0. The only one I have anything close to a copy of is C99, for which I have a very late draft. Based on that: You are not quite correct. Any integer may be converted to a pointer, and any pointer may be converted to an integer - but the mapping is entirely implementation-dependent, except in the integer->pointer direction when the integer is a "null pointer constant", defined as "[a]n integer constant expression with the value 0" (or such an expression cast to void *, though not if we're talking specifically about integers), in which case "the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function". You could have meant that, but what you wrote could also be taken as applying to the _run-time_ integer value 0, which C99's promise does not apply to. (Quotes are from 6.3.2.3.) I don't think there is any promise that converting a null pointer of any type back to an integer will necessarily produce a zero integer. Maybe we are reading things differently...? Looking at 6.3.2.3... As far as I read, paragraph 3 says: "An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.55) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function." Essentially, the integer constant 0 can be cast to a pointer, and that pointer is then a null pointer constant, also called a null pointer. And footnote 55 says: Oh. And I actually do not believe it has to be a constant. The text says "integer constant expression with the value 0, or such an expression..."
So either a constant expression, or just an expression, which gives a 0, can be cast to a pointer, and that will be the NULL pointer. (I realized when reading, that I might have implied that it only applied to constants, which I think it does not.) But I might have misunderstood everything, of course... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: NULL pointer arithmetic issues
On 2020-02-24 23:35, Mouse wrote: Unless I remember wrong, older C standards explicitly say that the integer 0 can be converted to a pointer, and that will be the NULL pointer, and a NULL pointer cast as an integer shall give the value 0. The only one I have anything close to a copy of is C99, for which I have a very late draft. Based on that: You are not quite correct. Any integer may be converted to a pointer, and any pointer may be converted to an integer - but the mapping is entirely implementation-dependent, except in the integer->pointer direction when the integer is a "null pointer constant", defined as "[a]n integer constant expression with the value 0" (or such an expression cast to void *, though not if we're talking specifically about integers), in which case "the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function". You could have meant that, but what you wrote could also be taken as applying to the _run-time_ integer value 0, which C99's promise does not apply to. (Quotes are from 6.3.2.3.) I don't think there is any promise that converting a null pointer of any type back to an integer will necessarily produce a zero integer. Maybe we are reading things differently...? Looking at 6.3.2.3... As far as I read, paragraph 3 says: "An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.55) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function." Essentially, the integer constant 0 can be cast to a pointer, and that pointer is then a null pointer constant, also called a null pointer. And footnote 55 says: "The macro NULL is defined in (and other headers) as a null pointer constant; see 7.17." So, 0 cast as a pointer gives a NULL pointer.
And paragraph 6 says: "Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type." And I can only read the "previously specified" to refer to the equivalence between a NULL pointer and integer 0, because nothing before paragraph 6 talks about pointer to integer, so I can't see how it can be read as something more specific than all the things mentioned in the previous paragraphs. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: NULL pointer arithmetic issues
On 2020-02-24 21:18, Mouse wrote: If we use 0x0, it can be a valid pointer. If we use NULL, it's not expected to work and will eventually generate a syntax error. Then someone has severely broken compatability with older versions of C. 0x0 and (when one of the suitable #includes has been done) NULL have both historically been perfectly good null pointer constants. Also...syntax error? Really? _Syntax_ error?? I'd really like to see what they've done to the grammar to lead to that; I'm having trouble imagining how that would be done. Unless I remember wrong, older C standards explicitly say that the integer 0 can be converted to a pointer, and that will be the NULL pointer, and a NULL pointer cast as an integer shall give the value 0. This is also used in such things as code traversing linked lists to check for the end of the list... And the C standard also explicitly allows the NULL pointer to not be represented by something with all bits cleared. Only the casting to/from integers has a well-defined behavior. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: alloca() in kernel code
On 2019-10-13 01:18, Joerg Sonnenberger wrote: On Sun, Oct 13, 2019 at 12:46:24AM +0200, Johnny Billquist wrote: But if you use alloca(), you will have to check what size you'd like to allocate, and not allocate more than some maximum amount, I would assume. Or do you really think that it is ok to just let it try no matter what amount is decided you want to allocate? And if you figure out an upper limit, then you might as well just define an array of that size in the function, and be done with it. All nice and good, but it doesn't help with the original problem. How to deal with dynamically sized data when there is no dynamic allocator. Without context, it is impossible to know if "dynamic size" means a reasonable sized string, a list of memory segments etc. As such, it is impossible to say if alloca (or just defining a fixed size array or whatever) is reasonable or not. I don't agree. No matter if we can tell what dynamic size means. We cannot allow allocation of arbitrary sized data on the stack. We know there is an upper limit before the system will crash. So either we immediately go for that limit (if we think we need that much), or we try to play dynamically, with a limiter at that limit. Playing it "dynamically" with a limiter then does not really gain us anything, so why do it? That was my point. Dynamic data in a kernel is problematic. Always. You will always have limitations on it, which are more restrictive than anything people write in user land. alloca() doesn't change that. If a fixed memory chunk isn't good enough, then alloca() will not really be any better, since it can't give you more anyway. It just does it (slightly) differently. If we need more memory than can be allocated on the stack, then you need to come up with other solutions. Just replacing a static size allocation with a call to alloca() doesn't solve any problem. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive!
|| tryin' to stay hip" - B. Idol
Re: alloca() in kernel code
On 2019-10-12 20:47, Joerg Sonnenberger wrote: On Sat, Oct 12, 2019 at 08:13:25PM +0200, Johnny Billquist wrote: On 2019-10-12 19:01, Emmanuel Dreyfus wrote: Mouse wrote: I'm presumably missing something here, but what? I suspect Maxime's concern is about uncontrolled stack-based variable buffer, which could be used to crash the kernel. But in my case, the data is coming from the bootloader. I cannot think about a scenario where it makes sense to defend against an attack from the bootloader. The kernel already has absolute trust in the bootloader. On this one, I agree with Maxime. Even if it comes from the bootloader, why would you want to use alloca()? Because as Emmanuel wrote initially, dynamic allocations might not be possible yet. But if you use alloca(), you will have to check what size you'd like to allocate, and not allocate more than some maximum amount, I would assume. Or do you really think that it is ok to just let it try no matter what amount is decided you want to allocate? And if you figure out an upper limit, then you might as well just define an array of that size in the function, and be done with it. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: alloca() in kernel code
On 2019-10-12 19:01, Emmanuel Dreyfus wrote: Mouse wrote: I'm presumably missing something here, but what? I suspect Maxime's concern is about an uncontrolled stack-based variable buffer, which could be used to crash the kernel. But in my case, the data is coming from the bootloader. I cannot think of a scenario where it makes sense to defend against an attack from the bootloader. The kernel already has absolute trust in the bootloader. On this one, I agree with Maxime. Even if it comes from the bootloader, why would you want to use alloca()? It's not that it uses the stack, per se. That is obviously not the problem. The problem is that it can allocate arbitrarily large amounts of data. And it is not, as Mouse suggested, a case of optimizing memory use. If we get into a function, any memory allocated by alloca() only has the lifetime of the executing function. But allowing arbitrarily large chunks of data to be allocated here is bad, since it can cause crashes in many silly ways. But any local variable in the function has the same property regarding lifetime, and has the same property regarding optimizing memory use, since when that function is not active, there is no memory used. And if we then know of an upper limit to what we would allow to be allocated through alloca(), why don't we just have a function-local variable of that size already from the start? You are not going to use more memory than the worst case acceptable anyhow, which any called function still has to be able to run with (meaning how much stack is left). So what is the gain of using alloca()? 
You might be using less stack on a good day, but we already know the stack can handle the worst case anyhow, and this is temporary stuff, so there is honestly close to no gain from alloca(). (You could possibly argue more locality benefits for values, such as hitting the same cache lines if the alloca() would be grabbing very little memory, but if you are trying to optimize for that detail, then you're pretty lost, since this varies so much per CPU model and variant that you cannot do it in a sane way unless you are building very CPU-specific code with a ton of constraints.) What do we lose by using alloca()? In the best case nothing. In the worst case, you have created new vulnerabilities. So, again, why would you ever use alloca()? It's only a potentially bad idea, with no real gains here. We cannot allow arbitrary size allocations anyhow, and for the limited size allocations, we must be sure we always work in the worst case, so why not do that from the start, and have a (more) sane situation. alloca() is just bad here. (And potentially bad in general.) And, sadly, C99 also added a hidden alloca() into the base language itself, in the form of variable-length arrays, making life more miserable in general. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Removing PF
On 2019-04-01 15:17, Jaromír Doleček wrote: Le lun. 1 avr. 2019 à 14:32, Stephen Borrill a écrit : Your two statements are mutually inconsistent: 1) No-one is maintaining ipf or pf and 2) If the effort had been on one firewall instead of three, the one chosen would be more functional. IMO it's consistent - if we had one, it would be clear which one to contribute to, and clear whether a feature is really missing or working. i.e. nobody contributes anything because they don't know to which of the firewalls to contribute. It is obviously inconsistent. You can't both claim that no-one is maintaining it, and claim that people are putting some effort into it (with "it" being either pf or ipf). Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Removing PF
On 2019-04-01 15:16, Emmanuel Dreyfus wrote: On Sat, 30 Mar 2019, Maxime Villard wrote: 2) If the effort had been on one firewall instead of three, the one chosen would be more functional. Well, I cannot tell for PF, but IPF is functional, I use it a lot and I am not alone. It may have bugs, but if you really have to remove it, please make sure there is an easy migration path. Yeah. I happen to use ipf as well. Anecdotally, it would appear that npf might be the least tested or used option... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Removing PF
On 2019-03-30 13:13, BERTRAND Joël wrote: Maxime Villard a écrit : Hum, wait a minute in here. Am I being totally dumb, or NPF does support ftp-proxy since 2011? See src/dist/pf/usr.sbin/ftp-proxy/npf.c, and the associated commit message. There are some pieces of code that try to support NPF in ftp-proxy. But this code doesn't work as expected (and some code is between #if 0/#endif). I have tried to fix this code without success. Maybe I have misunderstood some kernel NPF stuff, but my modified ftp-proxy crashed the kernel. Are you saying that NPF is not being updated or supported? Should we maybe remove it? Having broken code around for 8 years without anyone even much noticing (except you, maybe) or fixing it would suggest the code is rather rotten, and little used. Only half a :-) intended. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 20:11, Jason Thorpe wrote: On Mar 16, 2019, at 11:29 AM, Johnny Billquist wrote: As for me personally, yes, I am certainly guilty of mostly making noise, and few contributions. I used to do a bit more, but mostly on VAX-specific stuff. But since others were making changes all the time, making the VAX port less and less usable, I instead stopped trying to fix things. So maybe I should just leave/fork/whatever. That is, I guess, the way I should view all of this. The NetBSD project even publicly states on the ports page (https://www.netbsd.org/ports/) that some platforms are more equal than others, and that it's the responsibility of those who care deeply about a non-Tier 1 platform to keep it in tip-top shape. This is not an unreasonable position. I know it states so. When that happened was one of those moments where my engagement dropped as well. But, consider your VAX example... even though there is a very good emulation environment available for the VAX, the project is partially hamstrung by factors not necessarily under its control, e.g. the quality of compilers available for the VAX. As I recently discovered while trying to do my own due diligence widely testing a set of cross-platform changes, C++ exceptions don't work on the VAX at all right now, so the ATF tests can't be run. Perhaps someone who cares deeply about the VAX ought to fix the situation. But if no one steps up, then it's fair to assume that no one in fact cares deeply about the VAX[*], and thus spending human productivity on it is not the best allocation of resources. Broken tools are definitely a big problem. But the fact that doing a full build has grown from a bit over one day to at least two weeks on an 8650 certainly didn't help either... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 19:59, Robert Elz wrote: Date: Sat, 16 Mar 2019 16:53:16 +0100 From: Maxime Villard Message-ID: <7acc19dd-9f66-f825-d517-6e7013de1...@m00nbsd.net> | if they don't subscribe, it's their problem, Really? Is that the same attitude you have everywhere? If your local council (or whatever the equivalent is in France) closes your street, and when you object says "we sent a request on our mailing list, no-one objected, and it is too late now, it is done", do you think you are to blame because you're not on their mailing list? Or if any other product that you use (and NetBSD is a product) changes after sending an announcement and request for objections to their mailing list? Cue the Hitchhiker's Guide... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 19:25, Jason Thorpe wrote: On Mar 16, 2019, at 6:43 AM, Johnny Billquist wrote: Make it work - don't remove it. That's rich. With some of the things we're talking about, "make it work" (or "keep it working") is a very resource-intensive proposition. Hey, I have an idea... since you care so much about it, why don't YOU step up and "make it work" / "keep it working". Seriously, at some point, those who care about these things so much need to be putting in the effort to keep them going. It's completely unfair to others working on the rest of the system to place the entire burden on them, especially when we're talking about things that are, by their nature, niche applications that require special resources. Fair enough viewpoint. To which my response is: then state this openly and clearly. And then people can decide if they want to run NetBSD or if they should look elsewhere. As for me personally, yes, I am certainly guilty of mostly making noise, and few contributions. I used to do a bit more, but mostly on VAX-specific stuff. But since others were making changes all the time, making the VAX port less and less usable, I instead stopped trying to fix things. So maybe I should just leave/fork/whatever. That is, I guess, the way I should view all of this. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 17:01, Mouse wrote: [...] It's a little like the modern mania for cross-building. It helps, but only if you/we don't forget that it's only a rough approximation. How long was it that the VAX was broken because there was something wrong that showed up on native builds but not cross-builds? People got used to thinking that because it cross-built, it was fine. It's one reason I insist on self-hosting on all my machines. You mean native builds on VAX now work again? When did that happen? :-) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 14:28, Maxime Villard wrote: I stated my point clearly and logically, about why certain things have legitimate reasons to go away, regardless of whether they are compat layers, or drivers, or something else. Rather than giving clear, logical counter-arguments, you are just repeating "XXX is YYY years old, remove it". You are going to have to provide better arguments I'm afraid, because you're not going to convince _*anyone*_ with that. I think that what Robert, and others (including me), actually argue is that things should not be removed, and the reason is that this is the core mission, purpose, reason (or whatever you want to call it) for NetBSD's existence. Instead things should be fixed, because that is what it all is about. Make it work - don't remove it. And what you are arguing for would imply a change of what the reason for existence is. Now, like I said before, maybe we should have a discussion then about what the reason for NetBSD is. (And Robert certainly has me convinced, so your last statement is provably false. Don't try to speak for everyone.) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 13:37, Michael Kronsteiner wrote: On Sat, 2019-03-16 at 18:17 +0700, Robert Elz wrote: Date: Sat, 16 Mar 2019 09:45:07 +0100 From: Maxime Villard Message-ID: NetBSD can support newer hardware at the OS level, and old userland, which doesn't care what the hardware underneath is in any detail (just i see. would YOU run software that costs maybe 4-digit numbers for a yearly license on an OS that "maybe" runs it "somehow"? why do you think prices for those old machines have skyrocketed? they run NATIVE. because it WORKS (tm). and "newer" hardware is quite relative, to stay polite. there's no newer alpha or vax or decstation than the existing models. not even all of them are fully supported (just check the webpages port-pmax, port-vax, port-alpha). btw even digital unix runs fine and stable on MP alphas. maybe "primary targets" should be a bit rearranged. last release of tru64 was in 1999 or so as well. to support that is not really on my wishlist for now. for the time being - if i need an ultrix or osf (dec unix, tru64...) setup, i will install exactly that. it seems what you're trying to do is to shoehorn it in, like stuffing a 2n3055 into modern microelectronics. i think this compatibility stuff can be kept for later. first get a stable base. development on that side has improved a lot in recent years. it's often better to stay "on track" for now. it's sad to say it directly, but as a businessman, NetBSD is not an option for now. i want more money, not more work with unstable machines. (yes, i'm greedy) i apologize to everyone who found my words a bit harsh, but that's reality. Well, if we want to talk reality - why are you even looking at NetBSD? Reality, business wise, is Linux. You might possibly argue FreeBSD. NetBSD is not an option, and will never become an option. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 09:45, Maxime Villard wrote: Le 16/03/2019 à 06:23, John Nemeth a écrit : On Mar 15, 10:31pm, Michael Kronsteiner wrote: } } i have this discussion today aswell... considering 64/32bit machines. } if you want ultrix, install ultrix. if you want osf1/dec unix/tru64 } install that. being able to run ummm nearly 20 year old binaries... } well. if thats what you want be prepared for a ride. i never ran } "foreign" binaries on a BSD. and i often compile myself even on more } "user friendly" systems. By any chance, have you seen our About page: http://www.netbsd.org/about/ ? The second paragraph reads thus: - One of the primary focuses of the NetBSD project has been to make the base OS highly portable. This has resulted in NetBSD being ported to a large number of hardware platforms. NetBSD is also interoperable, implementing many standard APIs and network protocols, and emulating many other systems' ABIs. - Emulating other systems is fundamental to what NetBSD is about. This is a really simplistic answer. It is not difficult to see that our website does not reflect reality at all. So, what is the reality then, in your opinion? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 11:50, Maxime Villard wrote: Le 16/03/2019 à 11:26, Johnny Billquist a écrit : If the answer is that we remove the code, then indeed, the whole webpage is incorrect, and we should change it to state that we do not try to be interoperable, implementing many standard APIs, or care about other platforms. There seems to be some confusion here. Being "interoperable" doesn't mean "keeping unmaintained code that is so broken it can't even emulate a dumbass mmap". (Thinking about SVR4 here.) I don't see any confusion. My question was about what NetBSD is about. If we are serious about being interoperable, implementing many standard APIs, and care about different platforms, then buggy or broken code should be fixed. Or at worst left for someone else to fix later. If we don't care about the interoperability, then indeed, we delete the code. If that's your definition of interoperable, then indeed, we used to be - and still are a bit, actually - a very, very interoperable system. Do you mean that deleting the code is your definition of interoperable? Regarding the web site, if my memory is correct, not too long ago we did change the page to put less emphasis on the compat layers, but the changes were backed out because one person wanted to keep the old version; if my memory is correct, this person was John Nemeth. I don't care who did what. I think it's time to maybe agree on what NetBSD is about. If people disagree, then either there should be an attempt at reaching a consensus, or else fork. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Regarding the ULTRIX and OSF1 compats
On 2019-03-16 10:24, Maxime Villard wrote: Le 16/03/2019 à 10:12, Johnny Billquist a écrit : On 2019-03-16 09:45, Maxime Villard wrote: Le 16/03/2019 à 06:23, John Nemeth a écrit : By any chance, have you seen our About page: http://www.netbsd.org/about/ ? The second paragraph reads thus: - One of the primary focuses of the NetBSD project has been to make the base OS highly portable. This has resulted in NetBSD being ported to a large number of hardware platforms. NetBSD is also interoperable, implementing many standard APIs and network protocols, and emulating many other systems' ABIs. - Emulating other systems is fundamental to what NetBSD is about. This is a really simplistic answer. It is not difficult to see that our website does not reflect reality at all. So, what is the reality then, in your opinion? The reality is that our compat layers are largely unmaintained and broken. One would have to be very dumb to believe that this constitutes a USP. But that doesn't necessarily make the webpage incorrect. The question then is what we do when something is broken. If the answer is that we remove the code, then indeed, the whole webpage is incorrect, and we should change it to state that we do not try to be interoperable, implement many standard APIs, or care about other platforms. Otherwise we should instead fix broken code, and keep the statements on the web page. Just saying that something is broken does not really answer what NetBSD is about. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: svr4, again
Unless I remember wrong, it's needed for Ultrix compatibility on VAX (and maybe MIPS?). Johnny On 2018-12-19 19:46, Brian Buhrow wrote: hello. The COMPAT_IBCS2 is for SCO OpenServer and Unixware binary compatibility. I used it regularly to run Oracle database clients and servers on NetBSD which were compiled for OpenServer. OpenServer hasn't changed much in the years since I did that, but is anyone really using OpenServer anymore? And, more aptly, is any company still building software for OpenServer? It may be the case that someone is still using some old software for OpenServer, but it's probably a legacy system that can be run as a VM going forward to protect it from hardware changes which OpenServer is never going to support. In short, COMPAT_IBCS2 worked well, but it too, should probably go, again with the proviso that instructions be left around on how one might use it if one really needs it. -thanks -Brian On Dec 19, 5:32pm, Maxime Villard wrote: } Subject: Re: svr4, again } Le 19/12/2018 à 13:46, Maxime Villard a écrit : } While I'm at it, there was a conversation about compat_ibcs2 [1]. See } the thread, basically it is a compat for SVR3 on Vax. It seems that no } one knew exactly what was the purpose of it. } } Maybe something we could retire as well (even though it likely isn't } as broken as compat_svr4). Don't know, throwing this here in case } someone cares. } -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Missing compat_43 stuff for netbsd32?
On 2018-09-12 00:06, Paul Goyette wrote: On Wed, 12 Sep 2018, Johnny Billquist wrote: On 2018-09-11 23:06, Paul Goyette wrote: On Tue, 11 Sep 2018, Johnny Billquist wrote: Well, how about running actual BSD 4.3 binaries? :-) But this is obviously limited to VAX only. We don't have a compat_netbsd32 for vax. We have the module only for amd64, mips, and arm. I guess that is correct, but this was about COMPAT_43, not something_32. Or did I miss something? It was specifically about COMPAT_43 _under_ COMPAT_NETBSD :) D'oh! My fault then. Sorry. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Missing compat_43 stuff for netbsd32?
On 2018-09-11 23:06, Paul Goyette wrote: On Tue, 11 Sep 2018, Johnny Billquist wrote: Well, how about running actual BSD 4.3 binaries? :-) But this is obviously limited to VAX only. We don't have a compat_netbsd32 for vax. We have the module only for amd64, mips, and arm. I guess that is correct, but this was about COMPAT_43, not something_32. Or did I miss something? Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: Missing compat_43 stuff for netbsd32?
On 2018-09-11 17:35, Eduardo Horvath wrote: On Tue, 11 Sep 2018, Paul Goyette wrote: While working on the compat code, I noticed that there are a few old syscalls which are defined in sys/compat/netbsd32/syscalls.master with a type of COMPAT_43, yet there does not exist any compat_netbsd32 implementation as far as I can see... #64 ogetpagesize #84 owait #89 ogetdtablesize #108 osigvec #142 ogethostid (interestingly, there _is_ an implementation for osethostid!) #149 oquota Does any of this really matter? Should we attempt to implement them? I believe COMPAT_43 is not NetBSD 4.3, it's BSD 4.3. Anybody have any old BSD 4.3 80386 binaries they still run? Did BSD 4.3 run on an 80386? Did the 80386 even exist when Berkeley published BSD 4.3? I would say it's most definitely BSD 4.3. And no, it did not run on the 386. The first BSD version to run on anything besides the VAX was 4.3 Tahoe, unless I remember wrong, if we talk about actual BSD. There were forks, though. Such as SunOS. Which were done even earlier than 4.3. It's probably only useful for running ancient SunOS 4.x binaries, maybe Ultrix, Irix or OSF-1 depending on how closely they followed BSD 4.3. Well, how about running actual BSD 4.3 binaries? :-) But this is obviously limited to the VAX only. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: mmap implementation advice needed.
On 2018-03-30 22:31, Joerg Sonnenberger wrote: On Fri, Mar 30, 2018 at 04:22:29PM -0400, Mouse wrote: And I (and ragge, I think it was) misspoke. It doesn't quite require 128K of contiguous physical space. It needs two 64K blocks of physically contiguous space, both within the block that maps system space. (Nothing says that P0 PTEs have to be anywhere near P1 PTEs in system virtual space, but they do have to be within system space.) ...and the problem to be solved here is that the memory has become fragmented enough that you can't find 64KB of contiguous pages? If so, what about having a fixed set of emergency reservations and copying the non-contiguous pmap content into that during context switch? I don't think that was Ragge's problem. The problem was/is (if I understood it right) that someone can mmap more than 1G, and that will never be possible to map on the VAX. The P0 space is only 1G, and the same is true for the P1 space. But P0 and P1 are disjoint, so don't try to think of them as contiguous space. So, anything trying to grab more than 1G will never be possible. But it would appear that the MI part doesn't give any hooks to stop a process from doing just that. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: /proc/#/ctl removal
On 2017-08-27 14:09, D'Arcy Cain wrote: On 08/27/2017 03:59 AM, Christos Zoulas wrote: LGTM, perhaps leave a comment /* old P_FSTRACE 0x0001 */ instead of completely removing the constants for now as a reminder. Isn't that sort of duplicating what CVS does? I would say no. CVS allows you to go back in history, see what changed, see how things were before, and so on. Documenting something in the code is useful for people who are writing code. It would be impossible to always go back and check the full history of every file when you are doing work. If there is something that is useful for the future to be aware of, it needs to be documented in the code, including if it is something related to history. If it is totally irrelevant for future code writing, then there is no need to keep any comment in the code. But if, for example, some constant in a larger range was historically used for something, it is important to keep that around even if it is no longer used, since the value should maybe be left unused/undefined, and if you need some new value, you should grab a different one. If you get what I mean. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: /proc/#/ctl removal
On 2017-08-27 23:20, Kamil Rytarowski wrote: On 27.08.2017 16:07, Johnny Billquist wrote: On 2017-08-27 14:09, D'Arcy Cain wrote: On 08/27/2017 03:59 AM, Christos Zoulas wrote: LGTM, perhaps leave a comment /* old P_FSTRACE0x0001 */ instead of completely removing the constants for now as a reminder. Isn't that sort of duplicating what CVS does? I would say no. CVS allows you to go back in history, see what changed, see how things were before, and so on. Documenting something in the code is useful for people who are writing code. It would be impossible to always go back and check the full history of every file when you are doing work. If there is something that is useful for the future to be aware of, it needs to be documented in the code, including if it is something of related to history. If it is totally irrelevant for future code writing, then there is no need to keep any comment in the code, but if, for example, some constant of value in a larger range historically was used for something, this is important to keep around, even if it is no longer used, as it should maybe be left unused/undefined, and if you need some new value, you should grab a different one. If you get what I mean. Johnny I use git mirror for data mining. [...] You can do similar with any revision control system. But data mining is a different thing than writing code and knowing about details from past history. Which was my point. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: nanosleep() for shorted than schedule slice
On 2017-07-03 07:25, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: Having the normal wall clock driven by a tick interrupt has its points. We usually avoid this and use what hardware timer the platform offers. Which is the HZ interrupt, unless I'm confused. And that drives the wall clock, with all the additional bells and whistles of clock adjustments to make sure the clock normally is monotonic, and is just the number of seconds since epoch. It's just that for high resolution timers, ticks are not a good source. For anything else, they are just fine. So why conflate the two? Because you don't need two clock interrupts. The regular interrupt is just another event that happens to be scheduled in a regular interval. Right. It would mean having two clock interrupts. Need and need. There are lots of things you don't need, but which might make things more convenient. But, as I said before, I won't object to any implementation. I was just objecting to the argument that tickless was a requirement for getting high precision timers, which it is not. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: nanosleep() for shorted than schedule slice
On 2017-07-03 01:10, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: A tickless kernel wouldn't run callouts from the regular clock interrupt but would use a hires timer to issue interrupts at arbitrary times. The callout API could then be changed to either accept timespec values or just fake a much higher HZ value. Right. Not that I believe this have to be tied into tickless, but I suspect it might be easier to do it if we go tickless. Well, "not using a regular clock interrupt" is what "tickless" means. One does not exclude the other. We really should be able to deal with shorter times, even if we have ticks. That's a contradiction. "ticks" means that timed events are based on a regular clock interrupt. Of course you can speed up the ticks (e.g. Alpha uses HZ=1000), but that has other disadvantages. N.B. going tickless isn't difficult, it's just lots of work as it needs MD support on all platforms. There is no conflict in having both. Some things can run based on a regular tick, while some things are based on hardware timer that interrupts when the specified time have passed. That would, by pretty much all definitions I know, still classify as a tick-based system. It's pretty much just that we have two clock sources. Which is nothing new, really. And having some things driven by interrupt events is already the norm for non-clock based events. I believe most hardware already supports such an idea as well. One low resolution clock which generates interrupt at something like 100Hz, and one high resolution clock, which can be programmed to whatever (short) time in the future to generate an interrupt. Having the normal wall clock driven by a tick interrupt has its points. And having things like preemptive context switches based on ticks is pretty much what you want anyway. With a tickless kernel, you'll have to set up those ticks anyway. It's just that for high resolution timers, ticks are not a good source. For anything else, they are just fine. 
So why conflate the two? (Rhetorical question, and I won't object if someone just implements a tickless kernel as well, but it is not as trivial as hinted. You probably will end up having such a system run a tick anyway, unless you want to start calculating wall clock time in all kinds of places under various circumstances where wall clock is used. But of course, it's all doable.) Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: nanosleep() for shorted than schedule slice
On 2017-07-02 23:24, Michael van Elst wrote: b...@softjar.se (Johnny Billquist) writes: I don't get it. What was the problem with using nanosleep for short usleeps? usleep is just a wrapper around nanosleep. There is no difference except that nanosleep accepts higher precision delays. So, the comment previously that it was not ok to use nanosleep for short usleeps seems to not hold much water. The kernel computes the number of ticks to sleep, schedules a callout that will wake the thread up and calls the scheduler to run other threads in the meantime. The callout is later dispatched by the clock interrupt. So you always have to wait for at least one tick, with HZ=100 that's 10ms. Right. With the current implementation. And which means that "higher precision delays" are just imaginary. Our implementation does not actually give us any higher precision delays. A tickless kernel wouldn't run callouts from the regular clock interrupt but would use a hires timer to issue interrupts at arbitrary times. The callout API could then be changed to either accept timespec values or just fake a much higher HZ value. Right. Not that I believe this has to be tied into tickless, but I suspect it might be easier to do it if we go tickless. I think it's pretty bad that our timers only have the precision of ticks, even if people call something like nanosleep. By the way, the fact that nanosleep also in the end uses ticks means that the OP's suggestion of using nanosleep for short usleeps will not really improve the situation at all. We really should be able to deal with shorter times, even if we have ticks. But I guess it all depends on the hardware (as it does if we go tickless as well, though). Part of my confusion might also be slightly different uses of the term tickless. The kernel's use of ticks for time keeping and some work does not necessarily force us to use them for callouts. It's just convenient. We could (maybe should?) 
use a different mechanism for the callout subsystem. Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: b...@softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol
Re: nanosleep() for shorter than schedule slice
On 2017-07-02 18:54, John Nemeth wrote:
> On Jul 2, 1:16pm, Christos Zoulas wrote:
> } In article <1n8j63y.1pcs0owrn6gcem%m...@netbsd.org>,
> } Emmanuel Dreyfus  <m...@netbsd.org> wrote:
> } >
> } >I just encountered a situation where PHP performance on NetBSD is
> } >rather weak compared to Linux or MacOS X.
> } >
> } >The code calls PHP's uniqid() a lot of time. uniqid() creates an
> } >unique id based on the clock. In order to avoid giving the same
> } >value for two consecutive calls, PHP's uniqid() calls usleep(1) to
> } >skip to make sure the current microsecond has changed.
> } >
> } >On NetBSD this turns into a 16 ms sleep, which is 16000 times what
> } >was requested. This happens because the kernel scheduled another
> } >process, which is the behavior documented in the man page. However
> } >the result is that a PHP script full of uniqid() is ridiculously
> } >slow.
> } >
> } >I worked around the problem by reimplementing PHP uniqid() using
> } >uuidgen(), but that kind of performance problem could exist in many
> } >other softwares.
> } >
> } >I wonder if it would make sense for nanosleep(2) to check that
> } >requested sleeping time is shorter than a schedule slice, and if it
> } >is, spin the CPU instead of scheduling another process. Any opinion
> } >on this?
> }
> } The solution is to implement "tickless kernel". It is not that
> } difficult.
>
> The other option would be to tell PHP not to be so dumb. What happens
> on other OSes? I find it hard to believe that we're the only ones
> that aren't tickless.

One more reply in addition to my previous one. As I said, OSX, for
example, does use nanosleep. But I just checked a little more in
NetBSD, and it would seem that the man page for nanosleep actually
claims that our nanosleep is in fact using the normal clock ticks as
well, giving nanosleep a resolution of (normally) 10ms. Which seems
absurd. Are we really giving that poor a resolution to nanosleep
calls?

I would still say that this is not necessarily tied to a tickless
implementation, even if I can see that there is a connection here. We
should be able to deal with very high resolution clocks, even if we
have a normal clock ticking at 100Hz. But it might be easiest if we
were to move to a tickless implementation in general, in order to deal
with arbitrary clock times.

  Johnny

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se            ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
Re: nanosleep() for shorter than schedule slice
On 2017-07-02 18:54, John Nemeth wrote:
> On Jul 2, 1:16pm, Christos Zoulas wrote:
> } In article <1n8j63y.1pcs0owrn6gcem%m...@netbsd.org>,
> } Emmanuel Dreyfus  <m...@netbsd.org> wrote:
> } >
> } >I just encountered a situation where PHP performance on NetBSD is
> } >rather weak compared to Linux or MacOS X.
> } >
> } >The code calls PHP's uniqid() a lot of time. uniqid() creates an
> } >unique id based on the clock. In order to avoid giving the same
> } >value for two consecutive calls, PHP's uniqid() calls usleep(1) to
> } >skip to make sure the current microsecond has changed.
> } >
> } >On NetBSD this turns into a 16 ms sleep, which is 16000 times what
> } >was requested. This happens because the kernel scheduled another
> } >process, which is the behavior documented in the man page. However
> } >the result is that a PHP script full of uniqid() is ridiculously
> } >slow.
> } >
> } >I worked around the problem by reimplementing PHP uniqid() using
> } >uuidgen(), but that kind of performance problem could exist in many
> } >other softwares.
> } >
> } >I wonder if it would make sense for nanosleep(2) to check that
> } >requested sleeping time is shorter than a schedule slice, and if it
> } >is, spin the CPU instead of scheduling another process. Any opinion
> } >on this?
> }
> } The solution is to implement "tickless kernel". It is not that
> } difficult.
>
> The other option would be to tell PHP not to be so dumb. What happens
> on other OSes? I find it hard to believe that we're the only ones
> that aren't tickless.

I don't get it. What was the problem with using nanosleep for short
usleep's?

Also, how does tickless mode help? This is all about the current
implementation of usleep, which suspends the thread, and the time
slice the scheduler uses between context switches. A tickless
implementation will still split processing time into time slices
between context switches, so you'd still get the same effect with a
tickless kernel. You just wouldn't get the regular interrupt at each
tick, since that is actually what tickless mode is about (unless I'm
confused). With tickless mode, you instead keep track of time through
other means than a regular clock interrupt, which might do nothing
more than increment the time.

And I fail to see how it would be a problem to implement usleep using
nanosleep for short time values. In fact, OSX, as an example, uses
nanosleep for all usleep calls. You still have the caveat that the
suspension might be longer; the time argument just gives a lower
bound. But OSX obviously uses nanosleep for this scenario.

(Yes, I might totally have missed some earlier discussion on this
topic, and might be totally bonkers and clueless. In which case, feel
free to educate me.)

  Johnny

-- 
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: b...@softjar.se            ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
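[Editorial aside: the usleep-over-nanosleep wrapper Johnny describes OSX using takes only a few lines. my_usleep below is a hypothetical name; this is a sketch, not NetBSD's or Apple's actual libc implementation.]

```c
/* Sketch of usleep(3) implemented on top of nanosleep(2): convert
 * microseconds to a timespec and restart after signal interruption
 * with the remaining time, so the total delay still meets the lower
 * bound nanosleep promises. */
#include <errno.h>
#include <time.h>

int my_usleep(unsigned int usec)
{
    struct timespec req = {
        .tv_sec  = usec / 1000000,
        .tv_nsec = (long)(usec % 1000000) * 1000
    };
    struct timespec rem;

    /* If a signal cuts the sleep short, resume with what is left. */
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem;
    return 0;
}
```

Note that this changes nothing about the resolution problem discussed above: if the kernel rounds the timespec up to whole ticks, a my_usleep(1) still blocks for up to a tick, exactly as Emmanuel observed with PHP's uniqid().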