Re: panic in iic_search()
On Wed, 11 Nov 2020 19:17:37 +0100, Edgar Fuß wrote:

> I updated to netbsd-8 from yesterday (so that's 8.2_STABLE) and a newly
> compiled kernel crashes in iic_search().

[...]

Interesting - this looks very much like kern/55745 which I filed against netbsd-9. There must have been a pull-up that went to both -8 and -9?

For me, switching to HEAD kernel "fixed" the problem.

Cheerio,
Hauke

--
Hauke Fath
Grabengasse 57
64372 Ober-Ramstadt
Germany
Re: NVMM missing opcode REPE CMPS implementation
On Nov 11, 2020, at 3:23 AM, Reinoud Zandijk wrote:

> On Sat, Oct 31, 2020 at 11:16:52AM -0500, Robert Nestor wrote:
>> Apologies if this isn’t the proper place to bring this up, but the
>> discussion on this brings two questions to mind:
>>
>> 1) Since the proposed patch isn’t correct and was reverted, and assuming
>> there is a problem with this opcode, is there another correct fix coming?
>
> Working on a patch in libnvmm for it and it ought to work fine now, but I'm
> struggling with writing ATF test cases; they are needed to validate the
> emulation before committing it.
>
>> 2) Is there some code that one can insert locally into NVMM and/or LIBNVMM
>> to help catch other possible problems similar to this?
>
> There are some unhandled cases on Intel support code that can bomb out libnvmm
> and thus qemu. They hardly ever come by though and I only see them on one OS
> image (OpenServer v5 IIRC) and then consistently at the same place but I have
> no idea as to what triggers it there. The VM exit code is
> VMCS_EXITCODE_TASK_SWITCH (9) and not handled in nvmm_x86_vmx.c. See section
> 27.2.4 of
> https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3c-part-3-manual.pdf
>
>> While NVMM is very robust and runs a lot of other systems, there are some
>> that it still stumbles over. I’m sure the vast majority of users don’t care
>> about running something like OS X under NVMM for various reasons, it does
>> seem to be a real good test of emulation capabilities. Various versions can
>> be installed from standard, non-hacked distributions and run successfully
>> without hacks or modifications under KVM on Linux (macOS-Simple-KVM comes to
>> mind), but not under NVMM. Maybe the reason is that there are similar
>> missing opcodes being used that aren’t currently handled by NVMM?
>
> I have never tried to boot MacOS-X with qemu+nvmm. I think lots of other
> things will need to be prepared for qemu to even start.
>
> Do you know of other possible (more recent) OSs I could try to boot?

I’m no expert on any of the VM implementations, but from my experience running various versions of MacOS-X under VirtualBox and KVM on Linux works well enough to install and run basic applications without requiring anything more than the SMC key which qemu supports. Things like game playing, audio and video playback, and Bluetooth may be issues that require hacks, but I’ve never bothered with them.

My interest in being able to run MacOS-X under NVMM is two-fold: to help identify differences between KVM and NVMM that prevent NVMM from doing what KVM already does, and being able to run some old Mac Apps occasionally that no longer run under recent versions of MacOS-X and/or require older Mac HW. Being able to run these in a VM environment on NetBSD would be ideal for me, although I could already do that under KVM on Linux, I’d just prefer not having to.

The only other OSs I’ve tried are versions of Windows (-95, -98, -XP), Linux (Mint, Ubuntu, SUSE), Solaris and of course NetBSD and FreeBSD. Windows-95 almost works but eventually fails (for me at least) because of timing issues I think which are there in all VMs so it’s not an NVMM issue. Linux works fine once you get past the ACPI installation issue. Solaris-10 seems to work but Solaris-11 doesn’t, though I think that’s an issue with how that last version of Solaris was mucked up. And of course NetBSD and FreeBSD run without any serious issues.

Not being experienced in this, my approach has been to try and get something running in MintLinux using KVM, and if it runs there I try to run it under NetBSD with NVMM. I’ve tried XEN under MintLinux for some of the same experiments, but haven’t had a lot of success. I think on my system it’s some sort of configuration issue. I haven’t tried XEN under NetBSD mainly because I haven’t taken the time to try to figure out the manual configurations I need. Tried bypassing that by using VirtManager, but it doesn’t work (for me) on NetBSD - some issue about missing python scripts or modules.

macOS-Simple-KVM does work for installation and running of MacOS-X versions 10.13 thru 10.16 under KVM. (MacOS-X 10.16 Big-Sur is current I believe.) The scripts used don’t require having the MacOS media as they fetch what they need from the Apple Servers. I have an older script that works with 10.6 and 10.9 (the two versions I tested), but it requires that one has the official Apple distribution media to install from. None of these require any special setup or hacks other than inserting the SMC key into the qemu invocation command which I believe is embedded in the macOS-Simple-KVM scripts.

Hope this answers your questions and I look forward to doing more testing with your latest changes to NVMM.

-bob
panic in iic_search()
I have an AMD64 server running 8/amd64, which ran happily (other than USB issues, which is another story) with 8.1_STABLE from September 2019.

I updated to netbsd-8 from yesterday (so that's 8.2_STABLE) and a newly compiled kernel crashes in iic_search().

The last line printed before that is:
	iic0 at piixpm0: I2C bus
With the working kernel, the next line is:
	spdmem0 at iic0 addr 0x50: NT4GC72B4NA1NL-CG
Obviously, I have the
	spdmem* at iic? addr 0xxx
lines uncommented in my config.

The panic is:
	uvm_fault(0x90afec40, 0x0, 4) -> e
	fatal page fault in supervisor mode
	trap type 6 code 0x10 rip 0 cs 0x8 rflags 0x10246 cr2 0 ilevel 0x8 rsp 0x80d4f485
	curlwp 0x80a1b600 pid 0.1 lowest kstack 0x80d4c2c0
	kernel: page fault trap, code=0
	Stopped in pid 0.1 (system) at 0:uvm_fault(0x80afec40, 0x7fbfc000, 1) -> e
	fatal page fault in supervisor mode
	trap type 6 code 0 rip 0x80d4f070 cs 0x8 rflags 0x10216 cr2 0x7fbfc000 ilevel 0x8 rsp 0x80d4f070
	curlwp 0x80a1b600 pid 0.1 lowest kstack 0x80d4c2c0
	kernel: page fault trap, code=0
	Stopped in pid 0.1 (system) at netbsd:db_disasm+0x65: testb $0x1,0(%rdx,%rcx,8)

Backtrace:
	db_disasm() at netbsd:db_disasm+0x65
	db_trap() at netbsd:db_trap+0xf4
	kpd_trap() at netbsd:kpd_trap+0xe2
	trap() at netbsd:trap+0x5d6
	-- trap (number 6) ---
	?() at 0
	iic_search() at netbsd:iic_search+0x92
	mapply() at netbsd:mapply+0x39
	config_search_loc() at netbsd:config_search_loc+0xaf
	iic_attach() at netbsd:iic_attach+0x4cd
	config_attach_loc() at netbsd:config_attach_loc+0x19c
	config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
	piixpm_rescan() at netbsd:piixpm_rescan+0xed
	piixpm_attach() at netbsd:piixpm_attach+0x1e7
	config_attach_loc() at netbsd:config_attach_loc+0x19c
	config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
	pci_probe_device() at netbsd:pci_probe_device+0x57e
	pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x198
	pciattach() at netbsd:pciattach+0x198
	config_attach_loc() at netbsd:config_attach_loc+0x19c
	config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
	mp_pci_scan() at netbsd:mp_pci_scan+0x9c
	mainbus_attach() at netbsd:mainbus_attach+0x2ce
	config_attach_loc() at netbsd:config_attach_loc+0x19c
	cpu_configure() at netbsd:cpu_configure+0x2b
	main() at netbsd:main+0x2a8

Where to go from here?
Re: amd64 9.1, pre-wscons text colours?
> you mean you don't get a graphic console right away on that machine?

For what value of "right away"?

Autoconf starts printing[%] in (what looks like the) same text mode used by the bootloader - no visible mode-change glitch (it's connected to a flatscreen on which mode-change glitches are really obvious, about a half-second of blank screen). It takes a few seconds, and lots of lines of output[$], before DRMKMS takes over and switches the hardware into graphics mode.

So, depending on what "right away" means here, that might be yes or might be no.

[%] In whatever colours are selected (where light-brown-on-blue, presumably among other colour combinations, gets mapped to white-on-black by some mechanism, as discussed upthread).

[$] Based on comparing a very quick glimpse of the last output before the mode switch against /var/run/dmesg.boot after it's up, about 320 lines.

> Is this BIOS related?

I don't know. I saw someone saying something about it possibly being UEFI-related. The BIOS splash screen talks about UEFI, but the way the disk is set up is plain old MBR, nary a GPT in sight, so presumably it's booting via non-UEFI ("legacy") mechanisms.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: Temporary memory allocation from interrupt context
> On Nov 11, 2020, at 5:38 AM, Martin Husemann wrote:
>
> Yes, and of course the real code has that (and works). It's just that
>  - memoryallocators(9) does not cover this case
>  - kmem_intr_alloc(9) is kinda deprecated - quoting the man page:
>
> 	These routines are for the special cases. Normally,
> 	pool_cache(9) should be used for memory allocation from interrupt
> 	context.
>
> but how would I use pool_cache(9) here?

It's not "deprecated" per se. Heck, kmem_intr_alloc() was added *after* the pool cache API was added :-). Sounds to me like memoryallocators(9) needs to be combed through and updated.

Anyway, I think what the documentation is trying to convey is that "pool_cache is better if you are allocating and freeing fixed size objects in a hot code path". However, you're not allocating fixed-size objects, so using pool_cache directly is not appropriate. Using kmem_intr_alloc() is preferable to rolling your own logic here, and gets you the optimal behavior for this use case.

-- thorpej
Re: Temporary memory allocation from interrupt context
> On Nov 11, 2020, at 1:44 AM, Martin Husemann wrote:
>
> In a perfect world we would avoid the interrupt allocation altogether, but
> I have not found a way to rearrange things here to make this feasible.
>
> Is kmem_intr_alloc(9) the best way forward?

While softints are backed by threads these days, kmem_intr_alloc() is the API to use in this scenario. (As an aside, kmem_alloc() is itself actually a wrapper around kmem_intr_alloc() that merely asserts that you're not in hard- or soft-interrupt context ... historically, there used to be a real distinction, because they were backed by different VM maps with different locking protocols ... these days, it's all backed by a vmem arena).

However, it is worth noting that softint threads are ONLY allowed to sleep if blocking on a mutex or rwlock; sleeping for memory allocation, or a condvar or whatever is not allowed (this is a policy decision rooted in the fact that any given softint thread can only be processing one softint at a time, and we want to prevent starvation). So, because you can't sleep, you must pass the KM_NOSLEEP flag to kmem_intr_alloc() (there's already an assertion to ensure the caller has passed exactly one of KM_SLEEP *or* KM_NOSLEEP, but we should probably add an ASSERT_SLEEPABLE() in the KM_SLEEP case to catch errors like this).

If the size fits into one of the kmem_cache or kmem_cache_big buckets, the allocation *will* come out of a pool_cache, and figuring out which pool cache to use is pretty quick, so you're not being penalized too badly here for not knowing the size ahead of time.

-- thorpej
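To make the last two points concrete, here is a small userland sketch in C. It is not the kmem(9) implementation: the bucket granularity, the size limit, the flag values and every name prefixed `sketch_` or `SKETCH_` are assumptions made up for illustration. It only shows why picking a fixed-size cache for a variable-size request is cheap (one shift), and what an "exactly one of the two flags" check might look like.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameters; the real kmem(9) buckets are not modelled here. */
#define SKETCH_SHIFT	3			/* buckets are multiples of 8 bytes */
#define SKETCH_ALIGN	((size_t)1 << SKETCH_SHIFT)
#define SKETCH_MAXSIZE	1024			/* largest size served by a bucket */

/* Hypothetical flag values standing in for KM_SLEEP / KM_NOSLEEP. */
#define SKETCH_SLEEP	0x1
#define SKETCH_NOSLEEP	0x2

/*
 * Map a request size to a bucket index, or -1 if the size is zero or
 * too large and would need a general-purpose fallback instead.
 */
static int
sketch_size_class(size_t size)
{
	if (size == 0 || size > SKETCH_MAXSIZE)
		return -1;
	/* Round up to a multiple of SKETCH_ALIGN, then index from zero. */
	return (int)((size + SKETCH_ALIGN - 1) >> SKETCH_SHIFT) - 1;
}

/* Object size, in bytes, held by bucket idx. */
static size_t
sketch_class_bytes(int idx)
{
	return ((size_t)idx + 1) << SKETCH_SHIFT;
}

/* True if exactly one of the two sleep flags is set. */
static int
sketch_flags_valid(int flags)
{
	return ((flags & SKETCH_SLEEP) != 0) != ((flags & SKETCH_NOSLEEP) != 0);
}
```

With these assumed parameters a 100-byte request lands in the 104-byte bucket, and a request above the limit is reported as "no bucket" so the caller can take a slower path.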
Re: Temporary memory allocation from interrupt context
On Wed, Nov 11, 2020 at 03:08:12PM +0100, Joerg Sonnenberger wrote:
> On Wed, Nov 11, 2020 at 10:44:45AM +0100, Martin Husemann wrote:
> > Consider the following pseudo-code running in softint context:
>
> Why do those items not have a link element inside, so that no additional
> memory allocation is necessary?

That would not help - I am collecting a subset of the items and don't want to keep the whole state locked for all actions on them. A single list element inside would not be enough (they do have one, that is how the whole list works).

I could create a reference struct (tuple of pointer and tailq entry) for each one collected and put that in a tailq, and then use a pool_cache(9) for the reference structs - which would make the whole thing similarly awkward as the kmem_intr_* variant.

Martin
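For readers who want to picture the "reference struct" alternative, a minimal userland sketch follows. It assumes a made-up `struct item` with a reference count, uses an "even value" test as a stand-in predicate, and uses malloc(3) where kernel code would take a non-sleeping allocation from a pool_cache(9); the names `item_ref`, `collect_matching` and `release_all` are hypothetical and not from the code under discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical item type; only the fields the sketch needs. */
struct item {
	int value;
	int refcnt;
};

/* Reference struct: a pointer to the item plus a tailq entry. */
struct item_ref {
	struct item *ir_item;
	TAILQ_ENTRY(item_ref) ir_link;
};
TAILQ_HEAD(item_ref_list, item_ref);

/*
 * Append a reference for every item with an even value (the stand-in
 * predicate). Returns the number collected, or -1 if an allocation
 * failed; references gathered before the failure stay on the list so
 * the caller can release them.
 */
static int
collect_matching(struct item *items, size_t nitems, struct item_ref_list *out)
{
	int n = 0;

	for (size_t i = 0; i < nitems; i++) {
		if (items[i].value % 2 != 0)
			continue;
		struct item_ref *ref = malloc(sizeof(*ref));
		if (ref == NULL)
			return -1;
		items[i].refcnt++;		/* retain the item */
		ref->ir_item = &items[i];
		TAILQ_INSERT_TAIL(out, ref, ir_link);
		n++;
	}
	return n;
}

/* Drop every reference on the list and free the reference structs. */
static void
release_all(struct item_ref_list *list)
{
	struct item_ref *ref;

	while ((ref = TAILQ_FIRST(list)) != NULL) {
		TAILQ_REMOVE(list, ref, ir_link);
		ref->ir_item->refcnt--;		/* release the item */
		free(ref);
	}
}
```

The sketch trades one variable-size allocation for one small fixed-size allocation per collected item, each of which can still fail, which is roughly the awkwardness described above.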
Re: Temporary memory allocation from interrupt context
On Wed, Nov 11, 2020 at 10:44:45AM +0100, Martin Husemann wrote:
> Consider the following pseudo-code running in softint context:

Why do those items not have a link element inside, so that no additional memory allocation is necessary?

Joerg
Re: Temporary memory allocation from interrupt context
Martin Husemann writes:

> On Wed, Nov 11, 2020 at 08:26:45AM -0500, Greg Troxel wrote:
>> > 	LOCK(st);
>> > 	size_t n, max_n = st->num_items;
>> > 	some_state_item **tmp_list =
>> > 		kmem_intr_alloc(max_n * sizeof(*tmp_list));
>>
>> kmem_intr_alloc takes a flag, and it seems that you need to pass
>> KM_NOSLEEP, as blocking for memory in softint context is highly unlikely
>> to be the right thing.
>
> Yes, and of course the real code has that (and works). It's just that
>  - memoryallocators(9) does not cover this case
>  - kmem_intr_alloc(9) is kinda deprecated - quoting the man page:
>
> 	These routines are for the special cases. Normally,
> 	pool_cache(9) should be used for memory allocation from interrupt
> 	context.
>
> but how would I use pool_cache(9) here?

Not deprecated, but for "special cases". I think needing a possibly-big variable-size chunk of memory at interrupt time is special.

You would use pool_cache by being able to use a fixed-sized object. But it seems that's not how the situation is.

I think memoryallocators(9) could use some spiffing up; it (on 9) says kmem(9) cannot be used from interrupt context.

The central hard problem is orthogonal, though: if you don't pre-allocate, you have to choose between waiting and coping with failure.
Re: Temporary memory allocation from interrupt context
On Wed, Nov 11, 2020 at 08:26:45AM -0500, Greg Troxel wrote:
> > 	LOCK(st);
> > 	size_t n, max_n = st->num_items;
> > 	some_state_item **tmp_list =
> > 		kmem_intr_alloc(max_n * sizeof(*tmp_list));
>
> kmem_intr_alloc takes a flag, and it seems that you need to pass
> KM_NOSLEEP, as blocking for memory in softint context is highly unlikely
> to be the right thing.

Yes, and of course the real code has that (and works). It's just that
 - memoryallocators(9) does not cover this case
 - kmem_intr_alloc(9) is kinda deprecated - quoting the man page:

	These routines are for the special cases. Normally,
	pool_cache(9) should be used for memory allocation from interrupt
	context.

but how would I use pool_cache(9) here?

Martin
Re: Temporary memory allocation from interrupt context
Martin Husemann writes:

> Consider the following pseudo-code running in softint context:
>
> void
> softint_func(some_state *st, )
> {
> 	LOCK(st);
> 	size_t n, max_n = st->num_items;
> 	some_state_item **tmp_list =
> 		kmem_intr_alloc(max_n * sizeof(*tmp_list));

kmem_intr_alloc takes a flag, and it seems that you need to pass KM_NOSLEEP, as blocking for memory in softint context is highly unlikely to be the right thing.

The man page is silent on whether lack of both flags is an error, and if not what the semantics are. (It seems to me it should be an error.)

With KM_NOSLEEP, it is possible that the allocation will fail. Thus there needs to be a strategy to deal with that.

> 	n = 0;
> 	for (i : st->items) {
> 		if (!(i matches some predicate))
> 			continue;
> 		i->retain();
> 		tmp_list[n++] = i;
> 	}
> 	UNLOCK(st);
> 	/* do something with all elements in tmp_list */
> 	kmem_intr_free(tmp_list, max_n * sizeof(*tmp_list));
> }
>
> I don't want to alloca here (the list could be quite huge) and max_n could
> vary a lot, so having a "manual" pool of a few common (preallocated)
> list sizes hanging off the state does not go well either.

I think that you need to pick one of:

  - pre-allocate the largest size and use it temporarily

  - be able to deal with not having memory. This leads to hard-to-debug situations if that code is wrong, because usually malloc will succeed.

  - figure out that this softint can block indefinitely, only harming later calls of the same family, and not leading to kernel deadlock/etc. This leads to hard-to-debug situations if lack of memory does lead to hangs, because usually malloc will succeed.

> In a perfect world we would avoid the interrupt allocation altogether, but
> I have not found a way to rearrange things here to make this feasible.
>
> Is kmem_intr_alloc(9) the best way forward?

With all that said, note that I'm not the allocation expert.
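As an illustration of the "be able to deal with not having memory" option, here is a userland sketch in C. It is an assumption-laden model, not kernel code: `alloc_nosleep` wraps malloc(3) as a stand-in for a non-sleeping allocation, `sketch_fail_alloc` is a made-up switch that forces the failure path so it can actually be exercised, and the "positive value" test stands in for the predicate.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical switch so the rarely-taken failure path can be tested. */
static int sketch_fail_alloc = 0;

/* Stand-in for a non-sleeping allocation: may return NULL. */
static void *
alloc_nosleep(size_t size)
{
	return sketch_fail_alloc ? NULL : malloc(size);
}

/*
 * Snapshot pointers to all positive values. On success returns 0, and
 * *out holds a list the caller must free() with *out_n entries. On
 * allocation failure returns ENOMEM and leaves *out and *out_n
 * untouched, so the caller can simply try again later.
 */
static int
snapshot_positive(int *values, size_t nvalues, int ***out, size_t *out_n)
{
	int **tmp;
	size_t n = 0;

	if (nvalues == 0) {
		*out = NULL;
		*out_n = 0;
		return 0;
	}
	tmp = alloc_nosleep(nvalues * sizeof(*tmp));
	if (tmp == NULL)
		return ENOMEM;	/* never touch a failed allocation */
	for (size_t i = 0; i < nvalues; i++) {
		if (values[i] > 0)
			tmp[n++] = &values[i];
	}
	*out = tmp;
	*out_n = n;
	return 0;
}
```

Having a way to force the failure in testing addresses the concern raised above: the error path is otherwise almost never run, because the allocation usually succeeds.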
Re: amd64 9.1, pre-wscons text colours?
On Wed, Nov 11, 2020 at 09:32:26AM +0100, Reinoud Zandijk wrote:
> you mean you don't get a graphic console right away on that machine? Is this
> BIOS related?

With UEFI you start with genfb. With BIOS there are three possibilities (ignoring the ega driver or pcconsole):

- VGA text mode (the default)
- VGA graphics mode (if compiled with option VGA_RASTERCONSOLE)
- VESA graphics mode with genfb.

Greetings,
--
                                Michael van Elst
Internet: mlel...@serpens.de
                                "A potential Snark may lurk in every tree."
Re: NVMM missing opcode REPE CMPS implementation
On Sat, Oct 31, 2020 at 11:16:52AM -0500, Robert Nestor wrote:
> Apologies if this isn’t the proper place to bring this up, but the
> discussion on this brings two questions to mind:
>
> 1) Since the proposed patch isn’t correct and was reverted, and assuming
> there is a problem with this opcode, is there another correct fix coming?

Working on a patch in libnvmm for it and it ought to work fine now, but I'm struggling with writing ATF test cases; they are needed to validate the emulation before committing it.

> 2) Is there some code that one can insert locally into NVMM and/or LIBNVMM
> to help catch other possible problems similar to this?

There are some unhandled cases on Intel support code that can bomb out libnvmm and thus qemu. They hardly ever come by though and I only see them on one OS image (OpenServer v5 IIRC) and then consistently at the same place but I have no idea as to what triggers it there. The VM exit code is VMCS_EXITCODE_TASK_SWITCH (9) and not handled in nvmm_x86_vmx.c. See section 27.2.4 of
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3c-part-3-manual.pdf

> While NVMM is very robust and runs a lot of other systems, there are some
> that it still stumbles over. I’m sure the vast majority of users don’t care
> about running something like OS X under NVMM for various reasons, it does
> seem to be a real good test of emulation capabilities. Various versions can
> be installed from standard, non-hacked distributions and run successfully
> without hacks or modifications under KVM on Linux (macOS-Simple-KVM comes to
> mind), but not under NVMM. Maybe the reason is that there are similar
> missing opcodes being used that aren’t currently handled by NVMM?

I have never tried to boot MacOS-X with qemu+nvmm. I think lots of other things will need to be prepared for qemu to even start.

Do you know of other possible (more recent) OSs I could try to boot?

With regards,
Reinoud
Re: amd64 9.1, pre-wscons text colours?
On Wed, Nov 11, 2020 at 05:20:53AM -, Michael van Elst wrote:
> VGA has only 8 colors and an intensity attribute. You cannot select
> the "light" ANSI colors. The VGA driver will fail attempts to
> allocate such color attributes, so you get white on black instead.
...
> When DRM takes over, such hardware limitations no longer apply.
> The framebuffer code can handle more colors.

you mean you don't get a graphic console right away on that machine? Is this BIOS related?

Reinoud
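The limitation quoted above can be modelled with a few lines of C. This is a simplified sketch and not the vga(4) driver: the function names and `SKETCH_` constants are invented, the colour numbering is assumed to run 0..7 for the base colours with 8 and above meaning the "light" variants, and the bit layout (low nibble foreground with bit 3 as intensity, bits 4-6 background) is the conventional text-mode attribute byte.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_NCOLORS	8	/* base colours available in text mode */
#define SKETCH_WHITE	7
#define SKETCH_BLACK	0

/*
 * Pack a text-mode attribute byte. Colour indices of 8 and above
 * (the "light" colours) cannot be represented and are rejected.
 */
static int
sketch_allocattr(int fg, int bg, int hilit, uint8_t *attr)
{
	if (fg < 0 || fg >= SKETCH_NCOLORS || bg < 0 || bg >= SKETCH_NCOLORS)
		return EINVAL;
	*attr = (uint8_t)((bg << 4) | (hilit ? 0x08 : 0) | fg);
	return 0;
}

/* Caller-side fallback: white on black when the request is refused. */
static uint8_t
sketch_attr_or_default(int fg, int bg, int hilit)
{
	uint8_t attr;

	if (sketch_allocattr(fg, bg, hilit, &attr) != 0)
		(void)sketch_allocattr(SKETCH_WHITE, SKETCH_BLACK, 0, &attr);
	return attr;
}
```

Under these assumptions a request for a "light" foreground (index 11, say) is refused and the caller ends up with plain white on black, which matches the behaviour described in the quoted text.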
Temporary memory allocation from interrupt context
Hey folks,

I have the feeling this question is pretty stupid and I must be missing something - so probably I will know a good solution immediately after sending this.

I am working on removing malloc(9) calls from some code. We have a great summary of options in memoryallocators(9), but for this situation it has no clear advice (actually the currently working solution below is not blessed by it at all).

Consider the following pseudo-code running in softint context:

void
softint_func(some_state *st, )
{
	LOCK(st);
	size_t n, max_n = st->num_items;
	some_state_item **tmp_list =
		kmem_intr_alloc(max_n * sizeof(*tmp_list));
	n = 0;
	for (i : st->items) {
		if (!(i matches some predicate))
			continue;
		i->retain();
		tmp_list[n++] = i;
	}
	UNLOCK(st);
	/* do something with all elements in tmp_list */
	kmem_intr_free(tmp_list, max_n * sizeof(*tmp_list));
}

I don't want to alloca here (the list could be quite huge) and max_n could vary a lot, so having a "manual" pool of a few common (preallocated) list sizes hanging off the state does not go well either.

In a perfect world we would avoid the interrupt allocation altogether, but I have not found a way to rearrange things here to make this feasible.

Is kmem_intr_alloc(9) the best way forward?

Martin