Re: NVMM sync up from DragonFlyBSD
On Sat, 27 Jan 2024 at 11:28, Emile 'iMil' Heitor wrote: > > > I've synced up our NVMM code (kmod, lib and tool) with its current > state in DragonFlyBSD; if you're using NVMM on NetBSD as your > hypervisor you might want to give it a try > https://github.com/NetBSD/src/compare/trunk...NetBSDfr:NetBSD-src:nvmm > > I've also added a vmware compatible CPU frequency cpuid leaf which can > be used to get CPU frequency from the host instead of spending > 100ms in DELAY(). Qemu knows how to expose it via the -cpu host,+invtsc > flag. > Looks like the patch applies almost cleanly to -10, just a small hand-patchable section at the top of both sys/dev/nvmm/x86/nvmm_x86_{svm,vmx}.c Going to give it a spin there :) Thanks! David
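For readers unfamiliar with the frequency leaf mentioned above, here is a minimal sketch of how a guest might decode it. It assumes the VMware-style convention (hypervisor leaf 0x40000010 returning the TSC frequency in kHz in EAX); this is not taken from the NVMM patch itself, and the function name is hypothetical. The register values are passed in as arguments so the sketch stays portable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed leaf numbers, following the VMware timing-leaf convention. */
#define CPUID_HV_BASE   0x40000000u /* EAX = highest hypervisor leaf */
#define CPUID_HV_TIMING 0x40000010u /* EAX = TSC frequency in kHz */

/*
 * Hypothetical helper: convert the timing leaf into a TSC frequency
 * in Hz. Returns false if the leaf is not advertised (or reads as
 * zero), in which case the caller would fall back to calibrating
 * with DELAY() as before.
 */
static bool
tsc_hz_from_hv_leaf(uint32_t max_hv_leaf, uint32_t timing_eax, uint64_t *hz)
{
	assert(hz != NULL);

	if (max_hv_leaf < CPUID_HV_TIMING || timing_eax == 0)
		return false;
	*hz = (uint64_t)timing_eax * 1000u;	/* kHz -> Hz */
	return true;
}
```

Here max_hv_leaf would be EAX from leaf 0x40000000 and timing_eax would be EAX from leaf 0x40000010.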
Re: PVH boot with qemu
On Wed, 6 Dec 2023 at 11:37, Emile `iMil' Heitor wrote: > > I got it working. > > NetBSD/amd64 kernel booting in PVH mode straight from qemu -kernel flag. > It now needs a lot of cleaning as it's basically a PoC, but here's a > WIP patch if anyone's interested in hacking into it. > > https://imil.net/NetBSD/qemu-pvh.patch > > Let me rephrase: I *know* it is ugly at the moment. I *will* make it > clean, just wanted to share the joy ;) > > Cheers, *excellent* work! David
Re: kern.boottime drift after boot?
On Tue, 10 Oct 2023 at 18:07, Robert Elz wrote: > > Date: Tue, 10 Oct 2023 12:42:48 +0100 > From: David Brownlee > Message-ID: > > > | I have a system which records the output of "sysctl -n kern.boottime" > | as part of a dhcpcd-exit.hook to ensure some processing only occurs > | once per boot. > > Cron's @reboot might help with that, that's its purpose. See crontab(5) (More context) - It's used as a dhcpcd-exit.hook to ensure some services are enabled only after an interface has an IP address, so in this case cron @reboot would not fit. > | kern.boottime (KERN_BOOTTIME) > | A struct timespec structure is returned. This structure > contains > | the time that the system was booted. That time is defined > (for > | this purpose) to be the time at which the kernel first > started > | accumulating clock ticks. > > That's correct, the issue is that the kernel doesn't really know what the > time is, early in the boot sequence, it just takes a guess based either > upon the RTC if the system has one (those tend not to be very accurate), > or the last mod time of the root filesystem (much less accurate) otherwise. > > As Crystal said, as the system time is corrected, the kernel can form a > better idea of what the time actually was when the system booted, based upon > the corrections that are being made to the current time of day. > > kern.boottime always contains the time that the system believes it was > booted, as best it knows what that was. The man page section you quoted > above is correct, and doesn't need updating. The manpage is correct, but incomplete. On reading it without understanding the implementation, there is ambiguity as to whether kern.boottime will be constant for any given boot (unless I've missed something elsewhere in the page). Furthermore it has the unfortunate behaviour of 'usually' appearing to be constant, which leads to easy assumptions, based on lack of clarity. Particularly as it would be easy to have an implementation which did have a constant per boot value, which would be more useful in some ways and less in others (I'm not arguing at this point that we should switch to such an implementation) David
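A minimal sketch of why the reported value can move, under the assumption (suggested by the explanation quoted above, but not confirmed in detail in this thread) that the boot time is effectively "current wall-clock time minus uptime". The function names and the tolerance are purely illustrative; a once-per-boot check that wants to survive small corrections could compare with some slop rather than for equality.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed model: boot time = wall-clock "now" minus uptime (seconds). */
static int64_t
estimate_boottime(int64_t now, int64_t uptime)
{
	return now - uptime;
}

/*
 * Hypothetical helper for a once-per-boot hook: treat two recorded
 * boot times as belonging to the same boot if they differ by no more
 * than slop seconds.
 */
static bool
same_boot(int64_t recorded, int64_t current, int64_t slop)
{
	assert(slop >= 0);
	return llabs((long long)(current - recorded)) <= slop;
}
```

With the values from the original report, a one second correction to the clock at the same uptime moves the estimate from 1696860044 to 1696860043, and same_boot() with a few seconds of slop still matches. A large correction (for example an RTC that was hours off) would defeat any small tolerance, so this is only a partial workaround.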
kern.boottime drift after boot?
I have a system which records the output of "sysctl -n kern.boottime" as part of a dhcpcd-exit.hook to ensure some processing only occurs once per boot. Except... it doesn't quite work as the value appears to not be constant for a given boot. I have one case where the value recorded by dhcpcd-exit.hook called during rc is 1696860044 but the value of sysctl -n kern.boottime is now 1696860043 sysctl(7) states: kern.boottime (KERN_BOOTTIME) A struct timespec structure is returned. This structure contains the time that the system was booted. That time is defined (for this purpose) to be the time at which the kernel first started accumulating clock ticks. I'm assuming it's calculated as some form of offset - maybe it would be better as an absolute value. If not, and it can drift about, then I'll at least update the manpage and look for a different mechanism for my 'once per boot' :) David
Re: GPT attributes in dkwedge [PATCH]
On Mon, 18 Sept 2023 at 18:21, Martin Husemann wrote: > > On Mon, Sep 18, 2023 at 06:14:58PM +0100, David Brownlee wrote: > > Specifically in the absence of any other information (empty devname? > > etc), would it not be reasonable to fall back to the bootme marked > > filesystem as a root filesystem candidate? I'm thinking about > > minimally configured disks moving between machines > > No, the bootme is usually on the EFI system partition, which is also > usually not set up as a root partition for NetBSD :-) Ah - that doesn't seem to match the FreeBSD usage of bootme - do we have something comparable to that? https://man.freebsd.org/cgi/man.cgi?query=gptboot=0=8=FreeBSD+13.2-RELEASE+and+Ports=default=html Our gpt(8) states "bootme flag is used to indicate which partition should be booted by UEFI boot code", which could be read either way.
Re: GPT attributes in dkwedge [PATCH]
On Sun, 17 Sept 2023 at 23:25, Robert Elz wrote: > [...] > > That is what you MUST NOT do, BOOTME has nothing whatever to do > with what is root. That's the part that must be done some other way. > > (The bit where the flag was copied into the wedge info was just a > layer violation, and easy to avoid, as your patch showed, that was > never the real issue.) > > Fortunately, it seems (as demonstrated by later discussion) that the > "other way" already exists, and none of this is needed at all. Apologies for a potentially dumb question. Specifically in the absence of any other information (empty devname? etc), would it not be reasonable to fall back to the bootme marked filesystem as a root filesystem candidate? I'm thinking about minimally configured disks moving between machines Thanks David
Re: [GSoC] Emulating missing Linux syscalls project questions
On Sun, 19 Mar 2023 at 04:22, Theodore Preduta wrote: > > > The Linux Test Project (http://linux-test-project.github.io) would help > > not only with finding missing syscalls, but also with finding bugs / > > missing functionality in the existing Linux emul code. > > Yes this is a great idea! Although my interpretation of the project > idea is that the expectations are that the binary is functional by the > end of the summer. I obviously will not be able to implement all > missing syscalls by the end of the summer, so I would have to draw an > arbitrary line as to what I would/would not try to implement. > > Which brings me to my next comment. > > > It would be nice to have this running on NetBSD. > > In what way exactly does the LTP not function on NetBSD? I tried it > today and (after a few hours of troubleshooting) seemingly got it to work. > > Some assorted notes about what I did/what it took to get it to work: > > - I only looked at system call tests (so for all I know the other types > of tests could be what you're referring to). > > - The actual testcases themselves can be trivially (just add -static) > statically compiled on any Linux distro and can be run individually just > fine, but the rest of the testing infrastructure cannot (because glibc). > (Most of my time spent on this was dealing with glibc versions) > > - Otherwise you can compile everything normally on OpenSUSE 15.4, and > with suse15_base installed the binaries will *almost* just work. > > - The ltp-pan binary does depend on /dev/kmsg (which doesn't currently > exist in the emul code), but only writes to it, so touch > /emul/linux/dev/kmsg is sufficient to trick it into working. > > - As expected, lots of tests fail, but also lots of tests pass! I > haven't looked too hard into the failing tests (yet), but I didn't find > anything too surprising in the list of failing tests. > > Overall, I did enjoy going down this rabbit hole! It definitely taught > me a few new things about how the emul subsystem behaves. I think making progress on running LTP on NetBSD (*), and fixing up a useful subset of missing or incomplete syscall implementations would make an _excellent_ GSoC project :) *: Even if it's "Build all the test infrastructure on Linux, then run it on NetBSD under Linux emulation", or "Build and run it on Linux, with the individual tests run as a ssh to a NetBSD box" - as the more interesting part is what the tests show David
Re: Add five new escape sequences to wscons
On Mon, 16 Jan 2023 at 17:20, Valery Ushakov wrote: > > On Mon, Jan 16, 2023 at 09:18:53 -0300, Crystal Kolipe wrote: > > > It's useful, because these sequences correspond to the terminfo > > capabilities rin, indn, vpa, hpa, and cbt as defined in the xterm > > terminfo entry. With these sequences implemented, it becomes > > slightly more practical to set TERM=xterm when connecting to remote > > systems that don't have a comprehensive terminfo database. > > Why is it desirable to set specifically TERM=xterm instead of, say, > vt220, or whichever vt entry describes wscons the closest? > > For multi-line scroll the patch just calls scrollup/scrolldown, but > that's not what the single-line scroll commands do (see > wsemul_vt100.c) > > I'm actually not entirely convinced that it's even correct to describe > vt220 as having sf/ind scrolling capabilities, b/c the vt220 scrolling > sequences take the scrolling region into account and the terminfo > capabilities for scrolling are defined to operate on the whole screen > as far as I can tell. > > So in its current form I don't think this patch is suitable and I'm > not convinced it's needed at all. Technically the wscons terminal type is wsvt25, an extended ANSI compatible terminal, already supporting more sequences than vt100. Having it also support a useful subset of xterm, providing it doesn't add an excessive amount of complexity, seems like a useful addition, particularly if other systems also have a "wscons" with similar additional handling. Double checking some of the new capabilities may well be a good idea, plus noting in comments that they exactly match xterm behaviour, and a short note in the wscons manpage - I don't have enough ANSI/terminfo context to add anything directly useful on that point. Thanks David
Re: SCSI Polled io fixes for mac68k with PDMA enabled.
On Fri, 30 Dec 2022 at 01:08, Izumi Tsutsui wrote: > I don't know a history why mac68k has both MI ncr5380 (GENERICSBC) and > the own NCR5380 driver (GENERIC with dev/mac68k5380.c) for years. Some models only worked with dev/mac68k5380 and some only worked with MI ncr5380, but there was never an intersection of people able to look at it and affected machines. I'm tempted to say by now if we can confirm MI ncr5380 working on any useful subset of machines then it should be the one in GENERIC and provide a GENERICNCR for the alternate David
Re: SCSI Polled io fixes for mac68k with PDMA enabled.
On Thu, 22 Dec 2022 at 11:47, Nathanial Sloss wrote: > > Hi, > > I've found while working with mac68k that devices that require polled io scsi > transfers would fail with sbc pdma (pseudo dma) when reading and > writing to the respective device. > > I've found that virtual devices for rascsi and the scsi2sd drives would fail > using pdma. For virtual disks they would fail always when writing to the > device, reads were ok. > > For virtual ethernet devices they would fail for reading and writing. > > To address this I've introduced a flag for sbc.4 PDMA_NO_WRITE which would > fallback consistently to polled io when writing to the device. > > Also to check the current scsi transfer control flag XS_CTL_POLL and if set it > would not use PDMA for that particular transfer. > > Please see: > > ftp.netbsd.org/pub/NetBSD/misc/nat/sbc_poll_fix.diff > > Any objections? Might it be possible to detect this at runtime - potentially by trapping a timeout and downgrading to polled io? David
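The runtime detection suggested above could be sketched roughly as follows. This is only an illustration of the "trap a timeout, then downgrade" idea: the structure and function names are hypothetical and are not taken from the sbc driver or the posted diff, and want_poll stands in for a transfer that has XS_CTL_POLL set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum xfer_mode   { MODE_PDMA, MODE_POLLED };
enum xfer_status { XFER_OK, XFER_TIMEOUT };

/* Hypothetical per-target state; not the real driver softc. */
struct target_state {
	bool pdma_disabled;	/* set after the first PDMA timeout */
};

/* Pick the mode for the next transfer to this target. */
static enum xfer_mode
choose_mode(const struct target_state *ts, bool want_poll)
{
	assert(ts != NULL);
	return (want_poll || ts->pdma_disabled) ? MODE_POLLED : MODE_PDMA;
}

/*
 * Record the outcome of a transfer. Returns true if the caller should
 * retry the same transfer with polled I/O; the target stays downgraded
 * for all later transfers.
 */
static bool
note_result(struct target_state *ts, enum xfer_mode mode,
    enum xfer_status st)
{
	assert(ts != NULL);
	if (mode == MODE_PDMA && st == XFER_TIMEOUT) {
		ts->pdma_disabled = true;
		return true;
	}
	return false;
}
```

Whether a PDMA transfer can be safely abandoned and retried after a timeout on this hardware is exactly the open question; the sketch simply assumes it can.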
Re: Dell PERC H330: no disks, no volumes
On Thu, 15 Sept 2022 at 19:27, Brad Spencer wrote: > > In the foggy recesses of my memory this is Just How It Is Done. At my > final $DAYJOB we had a set of systems that had some PERC controller in > them. The desire was to present the raw disks to Hadoop and the only > way that could be done was to create a virtual disk for each physical > device. There was no other option available to us. I was annoyed enough by this behaviour to swap out the PERC on my old T320 for another model, specifically one for which I could find generic LSI firmware, so it would expose the 8 disks directly to NetBSD (for ZFS use) mpii0: SAS9217-8i, firmware 20.0.7.0, MPI 2.0 David
Re: Debugging/fixing a kernel stalled not crashing
Tangentially... If it's an issue picking up the root filesystem, you could boot an INSTALL type kernel with a built in ramdisk with dhcpcd and sshd enabled, and see if you can ssh into the box (I think someone had pre-built arm images which did just that, so the code should be out there :) David
Re: killed: out of swap
On Wed, 15 Jun 2022 at 08:31, Johnny Billquist wrote: > > On 2022-06-15 06:57, Michael van Elst wrote: > > b...@softjar.se (Johnny Billquist) writes: > > > >> I don't see any realistic way of doing anything with that. > >> It's basically the first process that tries to allocate another page > >> when there are no more. There are no other processes at that moment in > >> time that have the problem, so why should any of them be considered? > > > > They might be the reason for the memory shortage. You can prefer large > > processes as victims or protect system services to keep the system > > manageable. > > So when one process tries to grow, you'd kill a process that currently > have no issues in running? Which means you might end up killing a lot of > non-problematic processes because of one runaway process? Seems to me to > not be a good decision. As opposed to the process which had a successful malloc some time ago and is running without issues, and is just about to try to use some of its existing allocation? Both options are wrong in some cases. Having a way to influence the order in which processes are chosen would seem to be the best way to end up with a better outcome. The existing behaviour should remain an option, but (at least for me) it would not be the one chosen David
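One possible shape for "influencing the order", as a sketch only. Nothing here exists in NetBSD as far as this thread shows: the record layout, the kill_bias knob and the exempt flag are all hypothetical, loosely combining the two ideas quoted above (prefer large processes, protect system services).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process record; not a NetBSD kernel structure. */
struct proc_info {
	int	 pid;
	uint64_t resident_pages;	/* memory currently held */
	int	 kill_bias;		/* admin-set: higher is killed sooner */
	int	 exempt;		/* nonzero: never chosen */
};

/*
 * Return the index of the process to kill, or -1 if none is eligible.
 * Order: highest kill_bias first, then largest resident size.
 */
static int
pick_victim(const struct proc_info *p, size_t n)
{
	int best = -1;

	assert(p != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (p[i].exempt)
			continue;
		if (best < 0 ||
		    p[i].kill_bias > p[best].kill_bias ||
		    (p[i].kill_bias == p[best].kill_bias &&
		     p[i].resident_pages > p[best].resident_pages))
			best = (int)i;
	}
	return best;
}
```

A return of -1 would leave room for the existing behaviour (the process that hit the shortage is the one killed) as the fallback.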
Re: killed: out of swap
On Tue, 14 Jun 2022 at 13:33, Robert Elz wrote: > > NetBSD implements overcommitted swap - many processes malloc() > (or mmap() which that really becomes in the current implementation) > far more memory than they're ever going to actually use. It is only > when some real physical memory is required (rather than simply a marker > "zero filled page might be required here") that the system actually > allocates any real resources. Similarly pages mapped from a file only > need swap space if they're altered - otherwise the file serves as the > backing store for it. > > Once upon a time there was a method to turn overcommitted swap off, and > require actual allocations (of RAM or swap) to be made for all reserved > (virtual) memory. I used to enable that all the time - but I haven't seen > any mention of it in ages, and the mechanism might no longer still exist. What might be interesting is a way to influence the order in which processes are chosen to kill... David
Re: Slightly off topic, question about git
On Mon, 6 Jun 2022 at 06:59, Brian Buhrow wrote: > > Hello. At the risk of raising the debate about which version control > system we should > use, I have a question about git, as well as a comment about it relative to > the NetBSD source > tree. I should preface my comments with the caveat that I am not by any > means a git expert, > and, in fact, I'm barely able to get anything I want out of it. With that > said, here are my > questions and observations. I'd be interested to know how others work around > these issues > and/or what you think of my observations. > > 1. In CVS, I can do something like: > cvs log sys/dev/pci/if_bge.c > and be given a complete history of the changes to that file, as well as a > list of all the > branches that file participates in and which versions apply to each branch. > And, I can do this > without having to download all of the history of that file onto my local > storage. > It seems like the only way to do this with a git repository is to > download the entire > source tree, along with its history and branches, using git clone with an > infinite depth. Is > this correct? If not, how can I see all the branches of a given repository > without having to > download the entire repository? git inherently looks at the local copy of the repo. So your options are - have a local copy - ssh to somewhere with a local copy - use a web tool or similar to browse > 2. Also, in my exploration of git, it seems like the git log command shows > all the commits for > each tag, rather than the comments for a specific file or object in the > repository. Again, is > this correct? 
You can do either or both - "git log trunk" "git log build.sh" or "git log trunk build.sh" As an aside, I have an alias of gl -> "git log --name-status" as I really prefer to see the filenames changed in each commit > If I am correct in my guesses about how git works, it seems like I > would have to download > the entire history of the NetBSD source tree if I want to browse its > branches, or the commit > history for any given file. This is a lot of overhead to examine tiny > portions of the tree, > relatively speaking, assuming we move to git for our version control system. > It strikes me > that requiring this much storage space from developers, would be a regression > from what we > currently do. Since I think we're smarter than that and since we have very > smart people on our > development team, I want to understand what it is that I don't get about git > that precludes me > from having to download the entire history of the source tree from day one > while still > retaining access to that history over time. "It's a feature". Half :) - Seriously though, the ability to actually browse and search the full history of a source tree as git allows compared to the godawful eye-of-the-needle view that CVS provides is a very valuable benefit of the tradeoff of having a local history. When looking at source tree history I use a cloned copy of the github src, then apply to the CVS tree as needed. For people with limited resources it will be a pain, though there are any number of services which provide remote web access to git trees. Having said that, the ever increasing memory requirements of modern gcc is a much bigger pain for limited resources with a relatively smaller benefit. I suspect most of this also works with s/git/hg/ assuming NetBSD switches to a mercurial repo David
Re: High kernel time, page scan rate & reclaims?
On Sun, 5 Dec 2021 at 05:42, Paul Ripke wrote: > > For the archives, since I just got annoyed again by the behaviour (I'm > running netbsd-9), this was likely fixed in: > > PR kern/54209: NetBSD 8 large memory performance extremely low > PR kern/54210: NetBSD-8 processes presumably not exiting > PR kern/54727: writing a large file causes unreasonable system behaviour > > in -current, and will be in netbsd-10. Just curious, but would you be willing to test boot a current kernel (with every other file unchanged) to see if it does resolve everything for you? If you can reproduce it currently in single user mode even better as it narrows the test even further :) David
Re: [PATCH] Move DRM-driver firmware from base to its own set, gpufw
On Thu, 23 Sept 2021 at 17:57, Robert Swindells wrote: > > David Brownlee wrote: > > > >If gpu firmware is somewhat special, is there any sense in moving it > >to /usr/libdata/firmware/gpu/... ? > > No. > > It needs to be in /libdata so that it is guaranteed to be on the boot > filesystem. Apologies - read that as libdata/firmware -> libdata/firmware/gpu David
Re: [PATCH] Move DRM-driver firmware from base to its own set, gpufw
If gpu firmware is somewhat special, is there any sense in moving it to /usr/libdata/firmware/gpu/... ? David
Re: Some changes to autoconfiguration APIs
As an alternative to switching config_found() to a C99 init variant... Code could be added to a tool which processes the source (cough cough "lint") to scan config_found() calls and pick up semantically invalid parameter uses David
Re: Some changes to autoconfiguration APIs
On Sun, 1 Aug 2021 at 22:47, Jason Thorpe wrote: > > > On Aug 1, 2021, at 1:56 PM, Mouse wrote: > > > >>> config_found(CF_VERSION, self, whatever, (const struct cfargs *){ > >>> .search = ..., > >>> .locators = ..., > >>> }) > > > >> What do you propose should be the behavior if the versions don't match? I > >> h$ > > > > I thought the mail you replied to said, though admittedly partly by > > implication: > > > >>> config_found() needs to check passed cf_version and convert for old > >>> versions. We are still left with a long tail of conversion code in > >>> config_found(), but callers Just Work. > > Right, "callers Just Work" is carrying a lot of water here. I want to know > specifically how people think it should behave. For example: What should > happen in the case of a semantic conflict that can't be resolved during > conversion? > > (If you can't tell, I'm a bit annoyed about folks having plenty of energy to > express their distaste with one solution, only to float a hand-wavy > alternative lacking specifics that also has flaws; sorry, abs@, I'm not > trying to pick on you here...). 
Not at all - my goal was to propose a potential alternative, and poking at gaps helps evaluation :)

As I see it:

1) netbsd-9 had an API which provided some degree of type safety, but was the result of accreting a baroque combination of functions and parameters to the point where it was difficult to use correctly - and the tree had any number of examples which were actively wrong, and would fail at runtime, mostly with misbehaviour, but potentially with panics

2) current has an API which is much easier to understand and use, has a nice degree of forward compatibility, though introduces some potential misuse cases which can only be detected at runtime - as a deliberate tradeoff to achieve a simple, compat calling API given the limitations of C

3) This email takes one of Taylor's suggestions and hangs an explicit version on the calls, which should give reasonable forward compatibility (not as good as 2, but better than 1), keeps his improved type safety, to hopefully give a more limited set of cases which would fail at runtime (mis-specified cfargs contents, and cases where a valid cfargs_v1 cannot be converted into a current cfargs)

Focussing on 2 & 3, the runtime issues are

a) Tag params missing value params & similar (applicable to 2)

b) Semantically valid options which do not make sense (applicable to 2 & 3)

For both of these the kernel can panic, or fail the attach with a nasty loud message (which I rather prefer), but we have the same runtime issue to handle for both 2 & 3

c) Parameters which made sense for an earlier version of the kernel API, but do not now (applicable to 2 & 3)

The obvious reply is "Don't do that", but if for some reason we have to, option 3 potentially has an advantage here, as for example the conversion code called by config_found() can know that the "search" value in cfargs_v1 needs to be swizzled differently to that of cfargs_v2

tl;dr - all options allow code to call into config with bad data, which it has to handle (panic or log & fail attach), we can only try to reduce, not eliminate that.

(Let me know if I've reduced the hand waving in the right area :)

David
Re: Some changes to autoconfiguration APIs
On Sun, 1 Aug 2021 at 21:50, Jason Thorpe wrote: > > > On Aug 1, 2021, at 12:48 PM, David Brownlee wrote: > > > > Possible thought to provide type safety with automatic versioning. > > > > Use C99 initializers with a CF_VERSION define. When cfargs changes we > > bump CF_VERSION. > > > > config_found() needs to check passed cf_version and convert for old > > versions. We are still left with a long tail of conversion code in > > config_found(), but callers Just Work. > > > > config_found(CF_VERSION, self, whatever, (const struct cfargs *){ > > .search = ..., > > .locators = ..., > > }) > > I would probably hide it in a macro (part of what I object to about this > method, which was floated before, is that it is needlessly verbose). > > What do you propose should be the behavior if the versions don't match? I > have an idea in mind, but I want to hear a concrete proposal first. Well, we're well into Perl TMTOWTDI territory here, but my first thought would be: - We start with CF_VERSION 1 and struct cfargs - when bumping from 1 to 2, copy the existing cfargs to cfargs_v1 then update, and add a convert_from_cfargs_v1 function - config_found() starts by checking if cf_version != CF_VERSION and calls convert_from_cfargs_v1 as needed - when bumping from 2 to 3, repeat with _v2, plus update convert_from_cfargs_v1, and add a new case to the start of config_found() David
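The steps above can be sketched as follows, at the point where CF_VERSION has just been bumped from 1 to 2. The names CF_VERSION, cfargs, cfargs_v1 and convert_from_cfargs_v1 come from the proposal; everything else (the field types, the added iattr field, and cfargs_normalise() standing in for the start of config_found()) is hypothetical and chosen only to keep the example self-contained. The real cfargs members have different types.

```c
#include <assert.h>
#include <stddef.h>

#define CF_VERSION 2	/* bumped whenever struct cfargs changes */

/* Frozen copy of the version 1 layout (field types are illustrative). */
struct cfargs_v1 {
	const char *search;
	const int  *locators;
};

/* Current layout; the extra field is a made-up version 2 addition. */
struct cfargs {
	const char *search;
	const int  *locators;
	const char *iattr;
};

static struct cfargs
convert_from_cfargs_v1(const struct cfargs_v1 *old)
{
	struct cfargs cur = {
		.search = old->search,
		.locators = old->locators,
		.iattr = NULL,		/* did not exist in version 1 */
	};
	return cur;
}

/*
 * Stand-in for the version check at the top of config_found():
 * normalise the caller's arguments to the current layout, or return
 * -1 for an unknown version so the caller can log loudly and fail
 * the attach.
 */
static int
cfargs_normalise(int cf_version, const void *args, struct cfargs *out)
{
	assert(args != NULL && out != NULL);

	switch (cf_version) {
	case CF_VERSION:
		*out = *(const struct cfargs *)args;
		return 0;
	case 1:
		*out = convert_from_cfargs_v1(args);
		return 0;
	default:
		return -1;
	}
}
```

A caller built against the old headers passes 1 (its compile-time CF_VERSION) together with a cfargs_v1, and is converted without source changes. In standard C the call-site literal would be written as the address of a compound literal, for example &(const struct cfargs){ .search = ... }, which is also the kind of detail a wrapper macro could hide.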
Re: Some changes to autoconfiguration APIs
On Sun, 1 Aug 2021 at 15:57, Jason Thorpe wrote: > > > On Aug 1, 2021, at 5:15 AM, Martin Husemann wrote: > > > > On Mon, May 10, 2021 at 10:30:09PM -0700, Jason Thorpe wrote: > >> > >>> On May 10, 2021, at 7:58 PM, matthew green wrote: > >>> > >>> please, can we revert and re-do with a type-safe API. > >> > >> I don't plan to revert, but I will consider a betterly-typed API > >> that's not extremely cumbersome to use. I am not a fan of Taylor's > >> proposals. Concrete proposals welcome. > > > > Ping? > > > > A decision on this API needs to happen before the netbsd-10 branch > > (this is on the branch blocker list) - we need to either backout or move > > forward some way. > > The situation hasn’t changed. I’m still waiting for concrete proposals. > Possible thought to provide type safety with automatic versioning. Use C99 initializers with a CF_VERSION define. When cfargs changes we bump CF_VERSION. config_found() needs to check passed cf_version and convert for old versions. We are still left with a long tail of conversion code in config_found(), but callers Just Work. config_found(CF_VERSION, self, whatever, (const struct cfargs *){ .search = ..., .locators = ..., }) David
Re: 9.1: boot-time delay?
On Tue, 18 May 2021 at 20:02, Mouse wrote: > > I'm dealing with a turnkey product running under 9.1/amd64. On certain > hardware, there is a pause, almost exactly 22 seconds, during autoconf. > I'm trying to eliminate it. A sufficiently cut-down kernel does the > job, but another cut-down kernel doesn't. I'm trying to track down > what's responsible. (The kernel that eliminates the pause is used by > the installer; the one that doesn't is the one that's used in normal > operation. Unless it turns out to be something essential for > operation, I'd like to cut it out of the operational kernel.) [...] > [ 3.288539] uhub2: 4 ports with 4 removable, self powered > [ 3.288539] uhub3: 6 ports with 6 removable, self powered > [25.272567] wd0 at atabus0 drive 0 > [25.273568] wd0: I'd take a long hard look at what ata or atapi devices were configured in the kernel - smells like a timeout (though I would have expected 30 seconds...) Though that seems obvious enough to have already been checked :-p David
Re: one remaining mystery about the FreeBSD domU failure on NetBSD XEN3_DOM0
On Fri, 16 Apr 2021 at 08:41, Greg A. Woods wrote: > What else is different? What am I missing? What could be different in > NetBSD current that could cause a FreeBSD domU to (mis)behave this way? > Could the fault still be in the FreeBSD drivers -- I don't see how as > the same root problem caused corruption in both HVM and PVH domUs.
Random data collection thoughts:
- Can you reproduce it on tiny partitions (to speed up testing)
- If you newfs, shutdown the DOMU, then copy off the data from the DOM0 does it pass FreeBSD fsck on a native boot
- Alternatively if you newfs an image on a native FreeBSD box and copy to the DOM0 does the DOMU fsck fail
- Potentially based on results above - does it still happen with a reboot between the newfs and fsck
- Can you ktrace whichever of newfs or fsck to see exactly what it's writing (tiny *tiny* filesystem for the win here :)
David
Re: Bounties for xhci features: scatter-gather, suspend/resume
On Fri, 26 Mar 2021 at 09:22, nia wrote: > > On Thu, Mar 25, 2021 at 08:36:25PM +, co...@sdf.org wrote: > > Hi all, > > > > I'd like to offer bounties for the following. > > I am also utilizing the wiki to make it easy for others to add their own > > bounties: http://wiki.netbsd.org/projects/funded/ > > > > > > > > xHCI resume support > > > > xhci is everywhere, and for many machines, it's the only remaining step > > for a flawless suspend/resume experiences. > > xhci_{suspend,resume} are unimplemented, and devices do not work after > > resume. > > > > (Contact nia in http://gnats.netbsd.org/56050 for actual hardware testing) > > > > I can offer a bounty of $200 for this. > > Offer valid until 1/July/2021. > > Offering another $100 for this, with the "win condition" being > working suspend on a lenovo x250. A regression with resume on > this machine may have been introduced in -current, it can resume > successfully 100% of the time with -9. I can add another $200 for working suspend/resume on a Thinkpad T480 (-current resume does not complete - it prints a few "WARNING: TSC time went backwards by 2650670841" type lines, not sure of -9 state). David
Re: X vs serial console?
On Tue, 9 Feb 2021 at 17:59, Mouse wrote: > > I don't know whether this is kernel or X11. There are things pointing > each way. > > At work, I've got 9.1 on an amd64 machine. When I boot it normally - > console on screen/keyboard - X works fine. > > But I'm having an issue. The machine is rebooting on me, sometimes, > and I don't know whether it's some kind of quasi-spontaneous hard-reset > or whether it's a panic. But, with X on the console, I can't tell > whether there's a panic or not. > > So, I booted it with serial console. But now, X doesn't seem to work. > There are a number of curious things involved. I think NetBSD would really benefit from a way to reparent the console device at runtime (I appreciate this comment does not directly help in any way at this point :) AFAIK X requires a wsdisplay to run on - which you don't seem to get with a serial console. I wonder if it might be possible to run it on genfb? Those dmesg outputs are _so_ different, that something seems very much off - can you get dmesg.boot from both cases? David
Re: zfs panic in zfs:vdev_disk_open.part.4
On Sat, 28 Nov 2020 at 19:50, Yorick Hardy wrote: > > Dear Juergen, > > Of course! I had a slight disaster with my CVS checkout, I will commit > and request a pullup it once I have completed a new checkout. Can confirm change and pullup fix the issue I was seeing - many thanks!
# uname -v
NetBSD 9.1_STABLE (GENERIC) #0: Sun Nov 29 11:41:49 UTC 2020 mkre...@mkrepro.netbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
# zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
iris0  7.25T  67.9G  7.18T         -     0%     0%  1.00x  ONLINE  -
:) David
zfs panic in zfs:vdev_disk_open.part.4
I'm seeing a (new?) panic on netbsd-9 with zfs. It seems to trigger when a newly created zfs pool attempts to be mounted:
panic: vrelel: bad ref count
cpu0: Begin traceback...
vpanic() at netbsd:vpanic+0x160
vcache_reclaim() at netbsd:vcache_reclaim
vrelel() at netbsd:vrelel+0x22e
vdev_disk_open.part.4() at zfs:vdev_disk_open.part.4+0x44e
vdev_open() at zfs:vdev_open+0x9e
vdev_open_children() at zfs:vdev_open_children+0x39
vdev_root_open() at zfs:vdev_root_open+0x33
vdev_open() at zfs:vdev_open+0x9e
vdev_create() at zfs:vdev_create+0x1b
spa_create() at zfs:spa_create+0x28c
zfs_ioc_pool_create() at zfs:zfs_ioc_pool_create+0x19b
zfsdev_ioctl() at zfs:zfsdev_ioctl+0x265
nb_zfsdev_ioctl() at zfs:nb_zfsdev_ioctl+0x38
VOP_IOCTL() at netbsd:VOP_IOCTL+0x54
vn_ioctl() at netbsd:vn_ioctl+0xa5
sys_ioctl() at netbsd:sys_ioctl+0x5ab
syscall() at netbsd:syscall+0x157
--- syscall (number 54) ---
7e047af6822a:
cpu0: End traceback...
Anyone seeing anything similar (I continue to have a bunch of other boxes which use zfs without issue)
David
Re: Sample boot.cfg for upgraded systems (rndseed & friends)
On Tue, 22 Sep 2020 at 18:02, Jonathan A. Kollasch wrote: > > On Tue, Sep 22, 2020 at 05:53:49PM +0100, David Brownlee wrote: > > Should NetBSD be shipping a default boot.cfg in /usr/share/examples > > (*) - thinking primarily of people who have upgraded from earlier > > NetBSD versions. > > > > I was looking to add in rndseed & just generally sync with the latest > > version but there doesn't seem to be a default example shipped with > > the system > > > > /boot.cfg is already shipped as part of the 'etc' set, and is handled by > etcupdate(8) like any other configuration file. Ah, thanks, excellent... then I have a different question :-p What would people think of installing an original copy of the etc set in /usr/share/examples/etc or similar - it's 4.9M extracted and ~500K compressed, and the ability to compare what is on the system to what it was shipped with would have saved me so much effort over the years :) David
Sample boot.cfg for upgraded systems (rndseed & friends)
Should NetBSD be shipping a default boot.cfg in /usr/share/examples (*) - thinking primarily of people who have upgraded from earlier NetBSD versions. I was looking to add in rndseed & just generally sync with the latest version but there doesn't seem to be a default example shipped with the system *: Or /usr/mdec, or /etc/... ? David
Re: Logging a kernel message when blocking on entropy
On Tue, 22 Sep 2020 at 12:35, Manuel Bouyer wrote: > > On Tue, Sep 22, 2020 at 02:31:50PM +0300, Andreas Gustafsson wrote: > > Manuel Bouyer wrote: > > > I'm not sure we want a user-triggerable kernel printf enabled by default. > > > This could be used to DOS the system (especially on serial consoles) > > > > You can already trigger kernel printfs as an unprivileged user. > > The first one that comes to mind is "sorry, pid %d was killed: > > orphaned traced process", but I'm sure there are many others. > > I think we should find and remove these (or make them conditional) > instead of adding unconditional new ones Maybe a standard way of rate limiting such messages, including indicating how many were skipped due to rate limiting when the next one gets printed? David
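A minimal sketch of the rate-limiting idea suggested above, written as plain userland C so it can be tried outside the kernel. The names `msg_ratelimit`, `msg_ratelimit_check` and `limited_log` are hypothetical and not an existing NetBSD interface; the time unit ("ticks") is also an assumption. NetBSD's ratecheck(9) already covers the limiting part; the only addition here is remembering how many messages were suppressed so the next printed one can report it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-message rate-limit state; not an existing NetBSD API. */
struct msg_ratelimit {
	uint64_t last_emit;	/* time of the last emitted message, in ticks */
	uint64_t min_interval;	/* minimum ticks between emitted messages */
	unsigned int skipped;	/* messages suppressed since the last emit */
	int emitted_once;	/* non-zero once a message has been emitted */
};

/*
 * Decide whether a message may be printed at time `now`.
 * Returns -1 if the message must be suppressed. Otherwise returns the
 * number of messages suppressed since the previous emitted one and
 * resets that counter.
 */
int
msg_ratelimit_check(struct msg_ratelimit *rl, uint64_t now)
{
	assert(rl != NULL);

	if (rl->emitted_once && now - rl->last_emit < rl->min_interval) {
		rl->skipped++;
		return -1;
	}

	int skipped = (int)rl->skipped;
	rl->skipped = 0;
	rl->last_emit = now;
	rl->emitted_once = 1;
	return skipped;
}

/* Print `msg` if allowed, appending the suppressed count when non-zero. */
void
limited_log(struct msg_ratelimit *rl, uint64_t now, const char *msg)
{
	int skipped = msg_ratelimit_check(rl, now);

	if (skipped < 0)
		return;
	if (skipped > 0)
		printf("%s (%d similar messages suppressed)\n", msg, skipped);
	else
		printf("%s\n", msg);
}
```

Keeping one such state per message site means a user who triggers the same printf in a loop costs one console line per interval, while the suppressed count still shows that something was happening.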
Re: fsck updating but not fixing filesystem
On Mon, 24 Aug 2020 at 09:04, David Brownlee wrote: > > On Sun, 23 Aug 2020 at 20:50, David Holland wrote: > > > > On Sun, Aug 23, 2020 at 08:14:31PM +0100, David Brownlee wrote: > > > > > > This time I've run fsck -f repeatedly and each time it marks the > > > filesystem as clean, but the next run finds another issue. > > > > > > This is netbsd-9 amd64 stable from nyftp, DELL, PERC H710P controller, > > > running RAID1. > > > > Are you sure the raid is clean? If it's not you can get bizarre > > behavior like this depending on which side of it any given read is > > serviced from. (That is: any given fsck run will see some of one > > version and some of the other and make some changes, which may or may > > not be consistent with what it sees the next time, and it all might > > converge or might not...) Hardware raid for the win... Or in this case not. After taking a block copy of the filesystem to another device, it comes up clean on fsck. I'm... a little annoyed at what purported to be a relatively nice Dell PERC raid card - battery backup an' all. Thanks David - I should have known better than to trust hardware... Now I just need to work out the best way to get to a trustworthy system :) David
Re: fsck updating but not fixing filesystem
On Mon, 24 Aug 2020 at 11:46, Mouse wrote: > > > I think the general consensus is that ffs can be inconsistent in ways > > fsck is unable to detect. > > ...much less fix. Yes. When I was doing the program that eventually > got massaged into resize_ffs, during development I had some filesystems > that were definitely corrupted but that fsck was happy with. (I rather > wish I'd saved some of them as test cases, but I didn't.) Sounds like there is an interesting fuzzing project in there for someone - make a filesystem image and then repeatedly damage it, then see if fsck can fix it, then if you get a rump panic when moving everything around, and then re-run fsck to see if it indicates any new issues :) (So far 3.5TB of my original RAID1 filesystem transferred to a plain disk, so should be able to run some A/B fsck tests later today to establish if the raid controller is the issue in this case) David
Re: fsck updating but not fixing filesystem
On Sun, 23 Aug 2020 at 20:50, David Holland wrote: > > On Sun, Aug 23, 2020 at 08:14:31PM +0100, David Brownlee wrote: > > > > This time I've run fsck -f repeatedly and each time it marks the > > filesystem as clean, but the next run finds another issue. > > > > This is netbsd-9 amd64 stable from nyftp, DELL, PERC H710P controller, > > running RAID1. > > Are you sure the raid is clean? If it's not you can get bizarre > behavior like this depending on which side of it any given read is > serviced from. (That is: any given fsck run will see some of one > version and some of the other and make some changes, which may or may > not be consistent with what it sees the next time, and it all might > converge or might not...) No problems are indicated by envstat for mfii, or in the BIOS setup interface (Careful phrasing there). However, I have a spare 8TB disk I can attach to the onboard ahcisata, dd the filesystem across and re-run the fsck to confirm. (I may be a little while in following up with that result :) On Sun, 23 Aug 2020 at 21:26, Michael Cheponis wrote: >[...] > Then I was wondering: given today's disks are mostly lying to the software > about how it's (internally) configured --- is there a 'better' FFS > (FFSv3 ?) that would better map to today's disks? Might there be a better > FFSvN for SSDs vs big HDs? Or just wait till ZFS is up to snuff? I would seriously consider ZFS - I have a couple of other boxes running ZFS, but this particular one panics if any zpool is mounted in multiuser (kern/55602) David
fsck updating but not fixing filesystem
I have a reasonably large ffs filesystem (7.4GB, 35,459,874 files) used as a dirvish backup target (dirvish creates a hardlink tree copy of the previous backup, and then runs rsync over it to provide relatively space efficient backups). One of the rsync processes hung, and upon reboot fsck checked the filesystem and marked it clean, but after a while it happened again, and then again a third time. This time I've run fsck -f repeatedly and each time it marks the filesystem as clean, but the next run finds another issue. This is netbsd-9 amd64 stable from nyftp, DELL, PERC H710P controller, running RAID1. filesystem was mounted -o log, which could have contributed to getting into this state, but presumably fsck should be able to get it out? (Waves hands and mumbles "triple indirect blocks") Each fsck run takes a little over 2 hours to complete (hence the desire to run with -o log) A sample is below. ** /dev/rdk5 ** File system is already clean ** Last Mounted on /home/media ** Phase 1 - Check Blocks and Sizes ** Phase 2 - Check Pathnames DIRECTORY CORRUPTED I=112567242 OWNER=1000 MODE=40775 SIZE=1536 MTIME=Jun 8 17:11 2020 DIR=? SALVAGE? yes MISSING '.' I=112567242 OWNER=1000 MODE=40775 SIZE=1536 MTIME=Jun 8 17:11 2020 DIR=? FIX? yes MISSING '..' I=112567242 OWNER=1000 MODE=40775 SIZE=1536 MTIME=Jun 8 17:11 2020 DIR=/.backup/server1/20200628/tree/opt/server/backup/source/e7/0154904991e7bc764e08dbcd93b5/8c FIX? yes ** Phase 3 - Check Connectivity ** Phase 4 - Check Reference Counts LINK COUNT FILE I=67564638 OWNER=1000 MODE=100664 SIZE=14190 MTIME=May 13 03:14 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564639 OWNER=1000 MODE=100664 SIZE=45384 MTIME=May 13 03:19 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564640 OWNER=1000 MODE=100664 SIZE=52785 MTIME=May 13 03:18 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564641 OWNER=1000 MODE=100664 SIZE=56018 MTIME=May 13 03:24 2020 COUNT 10 SHOULD BE 9 ADJUST? 
yes LINK COUNT FILE I=67564642 OWNER=1000 MODE=100664 SIZE=34840 MTIME=May 13 03:34 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564643 OWNER=1000 MODE=100664 SIZE=87961 MTIME=May 13 03:31 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564644 OWNER=1000 MODE=100664 SIZE=24847 MTIME=May 13 03:42 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564645 OWNER=1000 MODE=100664 SIZE=43803 MTIME=May 13 03:44 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564646 OWNER=1000 MODE=100664 SIZE=55538 MTIME=May 13 03:50 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564647 OWNER=1000 MODE=100664 SIZE=64131 MTIME=May 13 04:05 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564648 OWNER=1000 MODE=100664 SIZE=32730 MTIME=May 13 04:00 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564649 OWNER=1000 MODE=100664 SIZE=35156 MTIME=May 13 04:50 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564650 OWNER=1000 MODE=100664 SIZE=91008 MTIME=May 13 05:04 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=67564651 OWNER=1000 MODE=100664 SIZE=15127 MTIME=Jun 8 17:11 2020 COUNT 10 SHOULD BE 9 ADJUST? yes LINK COUNT FILE I=103736490 OWNER=1000 MODE=100664 SIZE=12134 MTIME=Mar 17 01:08 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736491 OWNER=1000 MODE=100664 SIZE=12007 MTIME=Mar 17 01:08 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736492 OWNER=1000 MODE=100664 SIZE=13711 MTIME=Mar 17 01:13 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736493 OWNER=1000 MODE=100664 SIZE=5313 MTIME=Mar 17 01:14 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736494 OWNER=1000 MODE=100664 SIZE=9659 MTIME=Mar 17 01:14 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736495 OWNER=1000 MODE=100664 SIZE=32231 MTIME=Mar 17 01:19 2020 COUNT 12 SHOULD BE 11 ADJUST? 
yes LINK COUNT FILE I=103736496 OWNER=1000 MODE=100664 SIZE=50302 MTIME=Mar 17 01:19 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736497 OWNER=1000 MODE=100664 SIZE=56209 MTIME=Mar 17 01:20 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736498 OWNER=1000 MODE=100664 SIZE=18932 MTIME=Mar 17 01:20 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736499 OWNER=1000 MODE=100664 SIZE=47033 MTIME=Mar 17 01:21 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736500 OWNER=1000 MODE=100664 SIZE=20355 MTIME=Mar 17 01:21 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736501 OWNER=1000 MODE=100664 SIZE=5218 MTIME=Mar 17 01:22 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736502 OWNER=1000 MODE=100664 SIZE=12071 MTIME=Mar 17 01:24 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736503 OWNER=1000 MODE=100664 SIZE=51133 MTIME=Mar 17 01:25 2020 COUNT 12 SHOULD BE 11 ADJUST? yes LINK COUNT FILE I=103736504
Re: modules item #14 revisited
Very much like this - would assume that modules.tgz goes away? Could logical extensions to this be: a) Allow including a miniroot as a separate file b) Use ustarfs to allow handling this layout of kernel, modules and/or miniroot as a (optionally compressed) tar file Thanks David
Re: Adding an ioctl to check for disklabel existence
While I agree NetBSD needs to support and work well with GPT in order to interoperate with other systems, there is also prior art in extending disklabel to 64 bits - OpenBSD did this back in 2007 (though there were a fair few follow-up commits to clean up the fallout :) https://github.com/openbsd/src/commit/ddfcbf38c8ab6225a6b172d829aa957007d2587f#diff-192d23728acf9d8a70ab7259784d4162 David
Linux emulation epoll support?
Is anyone working on epoll() or inotify support for compat_linux? Most recent Linux binaries seem to expect epoll() to be available. I noticed that there was some work towards it in FreeBSD... https://wiki.freebsd.org/linux-kernel Thanks :) David
Re: Proposal: new audio framework
On Tue, 2 Apr 2019 at 08:11, Tetsuya Isaki wrote: > Here are the details: > > On -current, as you know, blocksize is decided as follows: > 1. audio layer selects some size and asks it of the hardware driver. > This is the round_blocksize interface. > 2. if hardware driver cannot accept the size (for example, DMA > restrictions), hardware driver returns desired new size. > 3. audio layer accepts it unconditionally. > > Due to Step3's behavior and rumors (or obsoleted restriction?) that > block size must be a power of two, many drivers return the different > size even if proposed size is acceptable. > > AUDIO2 internal takes a block-oriented strategy, not a bytestream- > oriented, for performance and simplicity. > So, AUDIO2 changed it as follows: > a1. audio layer calculates suitable blocksize from its hardware > precision(stride), channels, frequency and blk_ms (= block > length in msec) parameters. > a2. and then ask it to hardware driver. It's round_blocksize. > a3. But if the hardware driver returns the other size, audio layer > cannot accept it because proposed size was calculated from > hardware encoding. > At the moment, I have no good idea for this case. :( If the various hardware restrictions are simple enough that they could be encoded as a small struct - eg boolean power_of_two; unsigned int min_size; unsigned int max_size; - then would it be reasonable for AUDIO2 to request the restrictions from the driver and adjust blk_ms or other parameters until it finds a fit?
Re: setting DDB_COMMANDONENTER="bt" by default
On 15 February 2018 at 17:51, Manuel Bouyer wrote: > On Thu, Feb 15, 2018 at 01:19:31AM +, Sevan Janiyan wrote: > > > > > > > On 15 Feb 2018, at 01:09, Paul Goyette wrote: > > > > > > Sounds like a good case for a custom kernel. Not sure that such a > > > specific situation would warrant turning this on for everyone... > > > > We do have this set by default on some config files albeit with > differing commands to run e.g xen kernels. > > The problem with setting it by default is that the important information > (the panic message, or the function where the fault happened) may > be scrolled out of the screen by the stack trace. So I wouldn't > recommend activating it by default. > the Xen kernels are a special case, because the console output > happens in an environment where it's easy to scroll back. > Is there some useful variant where the panic message is shown again at the end of the stack trace, or where the stack trace is limited to a very small number of entries by default? David
Re: Proposal: Disable autoload of compat_xyz modules
On 3 August 2017 at 12:11, Maxime Villard wrote: > Le 03/08/2017 à 10:42, matthew green a écrit : > >> Otherwise it has to be balanced. >>> >>> Certainly. It does not seem to me that moving compat_linux* into modules >>> is in >>> any way illegitimate or unbalanced. That's the opinion I was stating. >>> >> >> if you want to move useful and used by a large number of users >> functionality out of GENERIC and into modules then first perhaps >> you should consider fixing modules. >> >> there are a large number of basic functionality issues that no >> one pushing modules has solved yet. for a start, see lukem's >> original proposal about having a kernel+modules container, >> the functionality of which is a _essential_ before it's going >> to be considered OK to remove standard functionality from >> GENERIC. >> > > If your argument now is that there are technical difficulties that make > switching to a module approach a complicated business, beyond the > simplistic "I > don't want to type modload" stuff - which I don't agree with -, then > that's a > fair point. > > As I said, doing this work certainly involves, among others, finding a way > to > remove the many #ifdefs spread across the tree; and having tried to do so > two > years ago, I know it is a painful work. > > claiming that compat_linux isn't a major piece of usability >> is simply ignoring reality. >> > > I have never claimed it is not used. It is an important feature, but it > also > happens to have many places that need special care, which regularly turn > out to > be exploitable. If we can reduce the attack surface and at the same time > keep > the feature nearby, in a balanced way that does not impose too much burden > on > the regular users, then we should do it. But that's indeed ignoring the > technical difficulties behind achieving this goal. > How about a sysctl to enable/disable any non netbsd_ compat usage. 
With it off, compat code in GENERIC would not be run and (non netbsd32 etc) compat modules would not be loaded. David
Re: DISKLABEL_EI option for system with MBR
On 12 February 2017 at 11:57, Rin Okuyama wrote: > Michael, Martin, thank you for letting me know about wedge(4). > It is exactly what I need! It is more portable than my patch. > I withdraw the patch and the PR. I think that DISKLABEL_EI would still be a good idea - as it would make other endian disklabels Just Work for people (including easy fstab usage).
Re: /dev/sdN -> /dev/sdN[cd] (was: port-amd64/51216: Can't create wedges on a large (3TB) disk, gpt is ok but dkctl gives an error message)
On 7 June 2016 at 10:00, Robert Elz wrote: > Date: Mon, 6 Jun 2016 18:35:43 +0200 > From: Edgar Fuß > Message-ID: <20160606163542.gr5...@trav.math.uni-bonn.de> > > | > ie /dev/wd1 is a link to /dev/wd1d on i386 (etc) or /dev/wd1c (on sparc > etc) > | YES. > > I offer attached alternate patches, the first makes /dev/wd0 as a chrdev > and the second as a link. > > I do not have all the various architectures that have the various different > strategies for naming and minor-numbering disk devices to test this thoroughly > though, but what I have tested seems to work, and the changes (both versions) > are so simple they seem unlikely to fail (and if they do, the effect would > just be that the new nodes would not be correct, all the ones we're used to > having would be fine, so simply removing the bogus ones would return the > universe to its current state.) > > I prefer the chrdev version ... it is robust against removal of the ?dNx > node names, which (sometime later, after tools/scripts have been adapted > not to seek out the ?dN[cd] device names explicitly) might be something to > do on a system using GPT and wedges (or even disklabel wedge autodiscovery). > It also will provoke any lingering bugs if anything is currently relying on > vnode locking for device exclusivity (with two different vnodes for the same > underlying device). But either version should work (only one of them > of course!) > > Either version consumes 2 more names, and inodes, per disk device configured. > > Opinions? Also would prefer the chrdev version. We probably want to ensure these are added to install media as well (which may push some of them over a current inode limit but that is much less of a tweak than the ongoing kernel growth :)
Re: Locking strategy for device deletion (also see PR kern/48536)
On 7 June 2016 at 11:28, Paul Goyette wrote: > Can anyone suggest a reliable way to ensure that a device-driver module can > be _really_ safely detached? > > The module could theoretically maintain an open/ref counter, but making this > MP-safe is "difficult"! Even if the module were to provide a mutex to > control increment/decrement of its counter, there's still a problem: > > Thread 1 initiates a module-unload, which takes the mutex > > Thread 2 attempts to open the device (or one of its units), attempts to > grab the mutex, and waits > > Back in thread 1, the driver's module unload code determines that it is safe > to unload (no current activities queued, no current opens), so it > goes forward and unmaps the module - including the mutex! > > If the unload code releases the mutex, then thread 2 resumes, at an address > which has been unmapped, leading to all sorts of bad-stuff(tm). > (And, if the unload code doesn't bother to release the mutex before > destroying it, then thread 2 stalls indefinitely.) > > There currently doesn't seem to be a safe way to unload driver modules. > > > Any good MP-safe suggestions? Other than having the mutex be for a nullable pointer to the device which persists after driver detach and is reattached when the driver reattaches, which adds an extra pointer dereference for every use... :/
Re: RAIDframe raidN device order
On 20 April 2016 at 10:22, Edgar Fuß wrote: > When I configure my RAIDframe devices using raidN.conf, I may run into the > problem that after a reboot, the MPT controller may have assigned new pseudo > SCSI Target ID to my SAS discs, so they get different sdN numbers and the > array may fail to configure. > The solution seems to be to use auto-configuration for the arrays. But then, > how do I know which raidN device is which array? > Is there any way to use UUIDs to solve the problem? tl;dr - the Right Thing should Just Happen. Autoconfig will try to persist the raidN number, so once you have them set up you should be able to renumber any and all devices which provide raid partitions and have everything still work with the same raidN numbers (providing you can still boot the kernel if you are booting from raid :) The only time the raidN number will change is if you have two autoconfig devices with the same number (eg: when adding an already set up raid to a machine which overlaps with an existing raid)
Re: Simplify bridge(4)
On 15 February 2016 at 04:01, Ryota Ozaki wrote: > On Sat, Feb 13, 2016 at 7:19 AM, Mouse wrote: >> Sounds to me as though the most sensible way to model that would be to >> give the address to the bridge interface itself. >> >> I don't think I've tried that. If it does not work, is there any >> particular reason to add vether(4) rather than making it work? If it >> does work, what functionality would vether(4) provide over it? > > It's a design choice. FreeBSD adopts extending bridge(4) to assign > IP addresses and OpenBSD adopts vether(4). Both work and neither > is wrong. > > I prefer vether's approach because it keeps bridge(4) simple still > providing the same functionality of extending bridge itself. I think NetBSD supporting vether would also fix a couple of (at least interesting to some :) related use cases. a) Single interface machine running xen which needs the xen VMs on an internal network with dhcp and VPN/NAT on the external interface (this becomes quickly brain twisting and the solution is to plug in an additional ethernet card, just to act as the bridge endpoint) b) Running an emulator (which expects to tap onto an ethernet interface) on a machine with only a wifi interface
Re: i386 vs radeondrmkms problem - isa attachments suck
On 28 February 2015 at 09:44, matthew green m...@eterna.com.au wrote: hi folks. i've been trying to find a least-ugly solution to the radeondrmkms on i386 problem. quick summary of what's wrong: radeondrmkms doesn't complete attachments (and most importantly create a wsdisplay) until mountroot completes. this means it happens quite late in boot. in i386 GENERIC, vga@isa and pcdisplay@isa are still enabled and they will attach to the legacy vga device, and present a wsdisplay0 to the system. later, radeon0 attaches, and we get a wsdisplay1 that has taken over the console output. this leaves us with a non-working console output, and the inability to run X11 even if accessed remotely. my first attempt (that is currently committed), made the radeondrmkms driver attempt to map the isa vga registers to reserve them from the vga@isa, and while that worked on my serial console machine, it does not work on a normal system due to x86 consinit() attaching the basic vga console driver (so we get early console output.) in this case, it has already mapped these registers (ie, radeon is unable to map them) and the later real attachment knows not to attempt it again. so that method doesn't work. we could have the vga driver detach itself at the right point, but that leaves the console detached for quite a while, during the time that drm is getting setup (ie, we'd miss several of its early messages.) that seems less than desirable. it was suggested having a fake driver to attach instead of vga and thus avoiding the second phase of vga attachment, however this does not work due to the way isa indirect attachment works. the first match routine that returns non-zero is attached, and the order of routines called seems to be something config(1) generates. so having a radeon@isa that returns a higher priority does nothing if the ordering is bad. 
this means that the current expectation of eg, the vga@isa vs pcdisplay@isa drivers (where vga returns a higher match) is not used, it just happens that the cfdata[] array has the vga@isa entry before pcdisplay@isa.

Is the ignoring of attach priority a general characteristic of indirect buses, and might it make sense for config to be able to explicitly prioritise the order of the cfdata[] entries? I know uebayasi@ has been rototilling config and wondered if he could be interested... :)

this is not a problem for the old drm code, as it does not create wsdisplay itself, but relies on the vga driver to do so. (see isa.c:isasearch() config_match*() call for where the first match to return non-zero is used.) anyone have any other ideas? at this point to make DRMKMS work for i386 on -7, i think we may have to create a LEGACY kernel that has the vga|pcdisplay@isa drivers (and probably no drm at all?), and turn these devices off in GENERIC itself, but perhaps someone has a less ugly idea.
Re: posix message queues and multiple receivers
On 3 December 2013 22:45, David Laight da...@l8s.co.uk wrote: On Tue, Nov 26, 2013 at 01:32:44PM -0500, Mouse wrote: When serving a request takes nontrivial time, and multiple requests can usefully be in progress at once, it is useful - it typically improves performance - to have multiple workers serving requests. NFS, as mentioned above, is a fairly good example (in these respects). Except that NFS is a bad example, and mostly should have a single server. If you could arrange a NFS server for each disk spindle you might win. But what tends to happen is that the disk 'elevator' algorithm makes one of the server processes wait ages for its disk access to complete, by which time the client has timed out and resubmitted the RPC request. The effect is that a slightly overloaded NFS server hits a catastrophic overload and transfer rates become almost zero. Run a single nfsd and it all works much better. On that basis should the NetBSD default be changed from -n 4?
Re: high load, no bottleneck
http://www.math.uni-bonn.de/people/ef/dotcache/ has a typo in the first subheading Dotache :) On 24 September 2013 13:38, Edgar Fuß e...@math.uni-bonn.de wrote: We want fsync to do a disk sync, and clients are unlikely to be fixable. In my case, the culprit was SQLite used by browsers and dropbox. As these were not fixable, I ended up writing a system that re-directs these SQLite files to local storage (http://www.math.uni-bonn.de/people/ef/dotcache). RMW? Read-Modify-Write. On a RAID 4/5, writing anything that's not an entire stripe needs either to read the rest of the stripe (to be able to compute the new parity) before writing the modified part and the parity; or it (if you modify less than half the stripe) reads both the old data and old parity to compute the new parity. You don't have that on RAID 1, of course.
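The two ways of getting the new parity described above come down to XOR arithmetic. The following is only an illustrative sketch in plain C, with hypothetical function names (`raid5_full_parity`, `raid5_update_parity`) that are not taken from RAIDframe: the first recomputes parity from every data unit in the stripe, the second is the read-modify-write shortcut that needs only the old data and the old parity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Recompute parity from scratch: XOR of all data units in the stripe.
 * This is the "read the rest of the stripe" path.
 */
void
raid5_full_parity(uint8_t *parity, const uint8_t *const *data,
    size_t ndata, size_t len)
{
	assert(parity != NULL && data != NULL);

	memset(parity, 0, len);
	for (size_t d = 0; d < ndata; d++)
		for (size_t i = 0; i < len; i++)
			parity[i] ^= data[d][i];
}

/*
 * Read-modify-write path: given the old contents of the unit being
 * rewritten and the old parity, the new parity is
 *     old_parity XOR old_data XOR new_data
 * so the other data units never need to be read.
 */
void
raid5_update_parity(uint8_t *parity, const uint8_t *old_data,
    const uint8_t *new_data, size_t len)
{
	assert(parity != NULL && old_data != NULL && new_data != NULL);

	for (size_t i = 0; i < len; i++)
		parity[i] ^= old_data[i] ^ new_data[i];
}
```

Either way a small write turns into extra reads before the data and parity can be written, which is the cost that RAID 1 avoids.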
Re: high load, no bottleneck
crap, apologies for the non checked return address. In the interest of trying to make a relevant reply - doesn't nfs3 support differing COMMIT sync levels which could be leveraged for this? (assuming your server is stable :) <aside> I recall using NFS for file storage at Dreamworks in the late '90s and discovering the reason that the SGI file server boxes outperformed everything else was that they lied to the client and indicated data had been synced to disk as soon as it hit memory. Wonderful performance feature... until someone insisted on putting known buggy ATM drivers into production which could give up to a GB of lost data when the fileservers panicked... </aside>
Re: divergence of ffs flags
On 3 September 2013 03:04, David Holland dholland-t...@netbsd.org wrote: It seems that FreeBSD's and NetBSD's ffs superblock flags have been allowed to diverge:
[...]
-#define FS_SUJ        0x008 /* Filesystem using softupdate journal */
+#define FS_INDEXDIRS  0x008 /* kernel supports indexed directories */
[...]
-#define FS_NFS4ACLS   0x100 /* file system has NFSv4 ACLs enabled */
-#define FS_INDEXDIRS  0x200 /* kernel supports indexed directories */
-#define FS_TRIM       0x400 /* issue BIO_DELETE for deleted blocks */
+#define FS_DOWAPBL    0x100 /* Write ahead physical block logging */
+#define FS_DOQUOTA2   0x200 /* in-filesystem quotas */
What are the options? I assume we can version something in the superblock so new NetBSD and FreeBSD code could resolve the overlaps but that doesn't help old code... Pick new, non-conflicting flags for the overlaps, ask FreeBSD to reserve them, add code to support both versions to all branches, and then in a release or so migrate across to them?
Re: netbsd32 emulation in driver open() or read()
On 30 August 2011 16:05, Manuel Bouyer bou...@antioche.eu.org wrote: On Tue, Aug 30, 2011 at 10:19:20AM -0400, Christos Zoulas wrote: On Aug 30, 3:18pm, bou...@antioche.eu.org (Manuel Bouyer) wrote: -- Subject: Re: netbsd32 emulation in driver open() or read() | Yes, look at PK_32 in the process flags. If you are going to do this, please | look at what FreeBSD did with bpf_ts/bpf_xhdr and the time format changes | and do the same (provide timespec/bintime etc). This is how they handle | compatibility mode too. | | This is related to the BIOCSTSTAMP ioctl isn't it ? I can see how it's used | in kernel but I couldn't find it in userland. So, to me it looks like | the old bpf_hdr is used most of the time ... | I'm not sure if it's worth implementing BIOCSTSTAMP (and we have to assure | compat for bpf_hdr anyway) Might as well bite the bullet and do the whole thing because with 10Gb+ ethernet what we have now just does not cut it. This is not only the BIOCSTSTAMP that we need then, but also the zero-copy stuff, and probably more. And userland tools to use it (because AFAIK freebsd's tcpdump still uses the old bpf_hdr ...) That may be nice to have, but won't help with my problem which is getting a N32 mips binary to talk to a N64 kernel. If the structure was versioned to have 64 bit fixed sized timestamps, then the problem goes away for new code, though it does leave a COMPAT50 issue for older code...
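To make the "versioned, fixed-size timestamps" suggestion concrete, here is a small sketch of what such a capture header could look like. The struct `cap_hdr64`, its field names and `cap_hdr64_fill` are invented for illustration; this is not the existing struct bpf_hdr and not FreeBSD's bpf_xhdr layout. The point is only that with explicit-width fields the layout no longer depends on the size of `long` or `time_t` in the calling ABI, so an N32 binary and an N64 kernel would agree on it.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define CAP_HDR64_VERSION 1	/* hypothetical layout version */

/*
 * Hypothetical fixed-width capture header. Every field has an explicit
 * size, so 32-bit and 64-bit userland see the same 24-byte layout.
 */
struct cap_hdr64 {
	int64_t  ts_sec;	/* seconds, always 64 bits */
	uint32_t ts_nsec;	/* nanoseconds, 0..999999999 */
	uint32_t caplen;	/* bytes of the packet captured */
	uint32_t datalen;	/* original packet length */
	uint16_t hdrlen;	/* size of this header, for skipping */
	uint16_t version;	/* layout version, for future changes */
};

/* Fail the build if padding ever changes the layout. */
_Static_assert(sizeof(struct cap_hdr64) == 24,
    "cap_hdr64 must be 24 bytes on every ABI");

/* Fill a header from a native timespec, whatever its field widths are. */
void
cap_hdr64_fill(struct cap_hdr64 *h, const struct timespec *ts,
    uint32_t caplen, uint32_t datalen)
{
	assert(h != NULL && ts != NULL);

	h->ts_sec = (int64_t)ts->tv_sec;
	h->ts_nsec = (uint32_t)ts->tv_nsec;
	h->caplen = caplen;
	h->datalen = datalen;
	h->hdrlen = (uint16_t)sizeof(*h);
	h->version = CAP_HDR64_VERSION;
}
```

Older binaries built against the previous header would still need a compat path, which is the COMPAT50 concern raised above.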
Re: The default system module area path
On 10 August 2011 10:53, Marc Balmer mbal...@netbsd.org wrote: Currently, we install kernel modules under the following path /stand/arch/release/name/name.kmod The duplication of the name probably was meant to prevent escaping the path when a module name like ../../../foo was given on the commandline. I recently changed the module loading behaviour so that a module that is loaded from the default system module area must not, and can not, contain a path separator character. Therefore I suggest that we install modules into /stand/arch/release/name.kmod Seems excessively sane... (+1)
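The rule described above (a name loaded from the default module area may not contain a path separator) is what makes the flatter layout safe. A minimal userland sketch of that check follows; `module_default_path` is a hypothetical helper, not the kernel's actual loader code, and the base directory passed in is whatever the caller considers the default module area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<base>/<name>.kmod" for a module in the default module area.
 * Returns 0 on success, -1 if the name is empty, contains a path
 * separator (so "../../../foo" is rejected), or the result would not
 * fit in the buffer.
 */
int
module_default_path(char *buf, size_t buflen, const char *base,
    const char *name)
{
	assert(buf != NULL && base != NULL && name != NULL);

	if (name[0] == '\0' || strchr(name, '/') != NULL)
		return -1;

	int n = snprintf(buf, buflen, "%s/%s.kmod", base, name);
	if (n < 0 || (size_t)n >= buflen)
		return -1;
	return 0;
}
```

Since the name can no longer contain a separator, the per-module subdirectory adds nothing as a guard against escaping the module area.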
Re: Catweasel driver
On 6 February 2010 13:33, Frank Wille fr...@phoenix.owl.de wrote: Joerg Sonnenberger wrote: Can't you use the approach e.g. of the wpi(4) driver and load the firmware image from the filesystem? No, firmload(9) would not really be an option, because I want to detect connected devices, like a keyboard and floppy disk drives during boot. Without the firmware this would be impossible. firmload can currently load from a set of directories. Has anyone considered extending firmload to optionally load from memory as well - possibly an included ramdisk image? That would allow the choice of building in firmware images which could be loaded at boot and then the memory released. Plus it keeps a single consistent API. Just a thought :)
Hang with heavy build on cgd and MP
I have a largish java app which hangs my netbsd-5 amd64 X60s when building on cgd. It normally happens on the second consecutive build, using the native openjdk7. I originally noticed it on cgd on dk, but removing DKWEDGE_METHOD_BSDLABEL didn't affect matters. Moving the tree from the cgd partition on to a normal one (on the same disk) avoids the issue, as does using cpuctl to take one of the two cpu cores offline. It hangs hard - I can't drop into ddb. This is happening under sources from a day or so ago, and from early January. Does anyone have any thoughts? Thanks
Re: check reprogram PCI BAR
2010/1/19 Manuel Bouyer bou...@antioche.eu.org: On Tue, Jan 19, 2010 at 12:57:30PM -0600, David Young wrote: On Tue, Jan 19, 2010 at 12:57:57PM +0100, Manuel Bouyer wrote: On Tue, Jan 19, 2010 at 12:50:55PM +0100, Christoph Egger wrote: Why are the *FIXUP options disabled by default in x86 kernels? Because on some systems it reprograms the BARs in a way which doesn't work. I'm not sure the kernel can do this in a reasonable and safe way anyway, it would need detailed knowledge of the hardware, which may not be available. What detailed knowledge do you have in mind? For example, devices for which the kernel has no drivers, but which still have registers mapped in I/O or memory space. I'm sure PC hardware also has a few fun things I don't know about :) [Cutting across from a concurrent thread on another list...] It would be nice if the *FIXUP options could be made runtime configurable, so they could be enabled by 'boot -c' - would allow people to get at them from a stock GENERIC...