2010/5/25 Blue Swirl <blauwir...@gmail.com>:
>>> About bugs, IIRC NetBSD 3.x crash could be related to IOMMU.
>>
>> What does indicate it? It happens where the disk sizes are normally
>> reported, so it could be a scsi/dma/irq/fpu issue as well.
>
> IIRC the DVMA address was 0xfc004000, but the mapped entries were for
> 0xfc000000 to 0xfc003fff.

Hmm. It happens in all NetBSD versions from 1.6 to 3.1 inclusive, which
is probably a sign that the problem resides in qemu and not in NetBSD.

It looks like we have multiple problems here: they start with the
0xfc004000 access (which can theoretically be expected on the real
hardware too), as you pointed out, but what happens afterwards is
strange too:

- In the current qemu implementation we have a screaming NMI which
  NetBSD cannot clear. This happens because the NMI in qemu is literally
  non-maskable, while on the real hardware it can be masked with the
  'mask all' flag. I'll send a patch for it.

- With the masking patch, the NMI is no longer screaming but is still
  perceived as spurious. This may be ok if NetBSD (1.6-3.1) doesn't have
  a moduleerr_handler set. I don't see it set, although a
  moduleerr_handler variable is defined. It would be nice if someone
  with NetBSD kernel knowledge could comment on this.

- The current implementation of NMI pending clearing in qemu may be
  incomplete: if the source is not cleared, the pending NMI bit can be
  set again right after the user writes to the clear pending register.
  If I read page 39 of the Sun4m System Architecture correctly,
  points (2) and (6) suggest that once a pending NMI is cleared in a
  CPU, the CPU doesn't see the NMI until the next one comes. While that
  makes sense, especially in SMP configurations, I don't see how it is
  level-sensitive in this case. If 'another broadcast' means a new
  external event, the NMI behavior seems mixed: turned on by edge, and
  off by level. But the documentation says 'level-sensitive' everywhere.

Ideas?

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/
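
P.S. To make the edge/level question concrete, here is a toy model in C
of the two readings of the pending bit. It is only a sketch: it is not
qemu code, and every name in it (nmi_model, nmi_set_source,
nmi_clear_pending, nmi_delivered) is hypothetical. The 'mask all' flag is
assumed to gate delivery only, and the "edge-latched" variant is just one
possible interpretation of points (2) and (6).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not taken from qemu or the Sun4m docs. */
struct nmi_model {
    bool source;          /* current level of the external error source */
    bool pending;         /* pending NMI bit as seen by the CPU */
    bool mask_all;        /* 'mask all' flag (assumed to gate delivery only) */
    bool level_sensitive; /* true: pending follows the source level;
                             false: pending is latched on a rising edge */
};

/* The external source changes its level. */
static void nmi_set_source(struct nmi_model *m, bool level)
{
    assert(m != NULL);
    bool rising = level && !m->source;
    m->source = level;
    if (m->level_sensitive) {
        m->pending = level;   /* pending mirrors the source */
    } else if (rising) {
        m->pending = true;    /* only a new event sets pending */
    }
}

/* The guest writes to the "clear pending" register. */
static void nmi_clear_pending(struct nmi_model *m)
{
    assert(m != NULL);
    /* Level-sensitive: the bit comes straight back while the source is
     * still asserted. Edge-latched: it stays clear until the next edge. */
    m->pending = m->level_sensitive ? m->source : false;
}

/* Would the CPU take the NMI right now? */
static bool nmi_delivered(const struct nmi_model *m)
{
    assert(m != NULL);
    return m->pending && !m->mask_all;
}
```

With level_sensitive set, clearing the pending bit while the source is
still asserted has no lasting effect, which is the "screaming" case unless
mask_all is set. With it unset, the NMI stays quiet after the clear until
the source drops and rises again, which is how I read 'another
broadcast'.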