Re: OpenBSD 7.2 on Oracle Cloud

2023-04-30 Thread Stefan Fritsch

Hi,

What qemu version are you using? I cannot reproduce this with qemu 7.2. 
Can you try with a newer qemu?


Cheers,
Stefan

On 25.04.23 at 14:53, Aaron Mason wrote:

Yeah I'm getting the same thing. Trying a build in QEMU and
transferring in to see if that helps. Will report back.



Ok, good news, it still crashes at the same spot, but this time I've
got more data. Copying in tech@ - if I've forgotten anything let me
know and I'll fire up a fresh instance.

[REDACTED]
vioscsi_req_done(e,80024a00,fd803f81c338,e,80024a00,800d3228) at vioscsi_req_done+0x26
[REDACTED]


Ok, so based on the trace I got, I was able to trace the stop itself
back to line 299 of vioscsi.c (thank. you. random relink. And
anonymous CVS):

293  vioscsi_req_done(struct vioscsi_softc *sc, struct virtio_softc *vsc,
294  struct vioscsi_req *vr)
295  {
296  struct scsi_xfer *xs = vr->vr_xs;
297  DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs);
298
-->299  int isread = !!(xs->flags & SCSI_DATA_IN);
300  bus_dmamap_sync(vsc->sc_dmat, vr->vr_control,
301  offsetof(struct vioscsi_req, vr_req),
302  sizeof(struct virtio_scsi_req_hdr),
303  BUS_DMASYNC_POSTWRITE);

Maybe if I follow the rabbit hole enough, I might find out what's
going wrong between the driver and OCI. I've got a day off tomorrow
(yay for war I guess), I'll give it a bash and see where we end up.

--
Aaron Mason - Programmer, open source addict
I've taken my software vows - for beta or for worse


I enabled debugging on the vioscsi driver, rebuilt the RAMDISK kernel
with those drivers enabled, and got this:

vioscsi0 at virtio1: qsize 128
scsibus0 at vioscsi0: 255 targets
vioscsi_req_get: 0xfd803f80d338
vioscsi_scsi_cmd: enter
vioscsi_scsi_cmd: polling...
vioscsi_scsi_cmd: polling timeout
vioscsi_scsi_cmd: done (timeout=0)
vioscsi_scsi_cmd: enter
vioscsi_scsi_cmd: polling...
vioscsi_vq_done: enter
vioscsi_vq_done: slot=127
vioscsi_req_done: enter vr: 0xfd803f80d338 xs: 0xfd803f8a5e58
vioscsi_req_done: done 0, 2, 0
vioscsi_vq_done: slot=127
vioscsi_req_done: enter vr: 0xfd803f80d338 xs: 0x0
uvm_fault(0x813ec2e0, 0x8, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip 810e6190 cs 8 rflags 10286 cr2 8 cpl e
rsp 81606670
gsbase 0x813dfff0  kgsbase 0x0
panic: trap type 6, code=0, pc=810e6190

That "xs: 0x0" bit feels like a clue. It should be trivial to pick up
and handle, but what would be the correct way to handle that?

If I have it return if "xs" is found to be NULL, it continues - the
debugging suggests it goes through each possible target before
finishing up. I don't know if that's correct, but it seems to continue
booting after that even if my example didn't detect the drive with the
kernel I built (I used the RAMDISK kernel and it was pretty stripped
down).
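
The behaviour described above (the same slot being completed twice, the second time with "xs" already NULL) can be modelled outside the kernel. The following is only a sketch under stated assumptions: the model_* types and names are hypothetical stand-ins, not the real vioscsi structures, and whether silently ignoring the second completion is the right fix is exactly the open question in this mail.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures (not the real vioscsi types). */
struct model_xfer {
	int flags;
	int completed;		/* how many times this transfer was completed */
};

struct model_req {
	struct model_xfer *xs;	/* NULL once the request has been completed */
};

#define MODEL_DATA_IN 0x1

/*
 * Complete a request. Returns 1 if the completion was processed and 0 if
 * it was ignored because no transfer is attached (duplicate completion).
 */
int
model_req_done(struct model_req *vr)
{
	struct model_xfer *xs = vr->xs;
	int isread;

	if (xs == NULL)
		return 0;	/* stale or duplicate completion: nothing to do */

	isread = !!(xs->flags & MODEL_DATA_IN);
	(void)isread;		/* the real driver would sync its DMA maps here */

	xs->completed++;
	vr->xs = NULL;		/* detach so a second completion is harmless */
	return 1;
}
```

Calling model_req_done() twice on the same request processes the transfer once and ignores the second call, which mirrors the two "slot=127" lines in the debug output. It does not explain why the host reports the slot twice.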

I'm about to attempt a -STABLE build (I've got 7.3 installed and thus
can't yet build a snapshot, but I will do that if this test succeeds)
- here's the patch that hopefully fixes the problem. (and hopefully
gmail doesn't clobber the tabs)

Index: sys/dev/pv/vioscsi.c
===
RCS file: /cvs/src/sys/dev/pv/vioscsi.c,v
retrieving revision 1.30
diff -u -p -u -p -r1.30 vioscsi.c
--- sys/dev/pv/vioscsi.c 16 Apr 2022 19:19:59 - 1.30
+++ sys/dev/pv/vioscsi.c 25 Apr 2023 12:51:16 -
@@ -296,6 +296,7 @@ vioscsi_req_done(struct vioscsi_softc *s
   struct scsi_xfer *xs = vr->vr_xs;
   DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs);

+ if (xs == NULL) return;
   int isread = !!(xs->flags & SCSI_DATA_IN);
   bus_dmamap_sync(vsc->sc_dmat, vr->vr_control,
   offsetof(struct vioscsi_req, vr_req),






Re: Performance issues as KVM guest?

2018-01-12 Thread Stefan Fritsch
Hi,

I don't see this issue on my Debian system, but please try two things:

* disable kvm_intel.preemption_timer on the host 
(see /sys/module/kvm_intel/parameters/preemption_timer )
This seems to be buggy in linux 4.10 and newer

* enable hpet in the vm config:
Make sure no element in your libvirt xml disables the hpet timer
(and don't pass -no-hpet to qemu). Unfortunately, newer libvirt 
versions seem to disable hpet by default.


Different issue: If you remove the USB controllers, the CPU load on 
the host will drop by a few percent (~ 3%). Add a usb controller
section with model 'none' and remove all other usb controller sections.
Just removing the usb controller sections without adding the 'none' one
makes libvirt add them back (this is stupid).

Cheers,
Stefan

On Fri, 12 Jan 2018, Infoomatic wrote:

> Same problem here. While we did have significant differences in cpu 
> usage between FreeBSD and OpenBSD (basic OS without configuration: 
> FreeBSD ~ 33min CPU time, OpenBSD ~ 474min CPU time - both started at 
> the same time), with the latest kernel patches for Ubuntu 17.04 (our 
> test environments all run Ubuntu 17.04 for KVM VMs), OpenBSD now becomes 
> practically unusable: as soon as I su or login on the console with su, 
> cpu usage is at 100% - the system freezes. :-/ guess we need some 
> dedicated BSD machines to host some test-VMs ;-)
> 
> Regards,
> Robert
> 
> 
> > Sent: Thursday, 11 January 2018 at 20:32
> > From: "Kirill Miazine" 
> > To: misc@openbsd.org
> > Subject: Re: Performance issues as KVM guest?
> >
> > * Kent Watsen [2018-01-11 17:38]:
> > [...]
> > > > > Since my hosting provider https://www.bytemark.co.uk/cloud-hosting/
> > > > > patched for Meltdown last weekend I'm seeing significant performance
> > > > > issues with an OpenBSD virtual instance there. It seems okay after a
> > > > > fresh reboot but then progressively returns to being very slow: for
> > > > > example "sleep 1" may take four seconds, then five, six, seven, then
> > > > > rather more. Curiously it does tend to be an integral multiplier.
> > > > > 
> > > > > > I wondered, is anybody else seeing significant performance
> > > > > > problems with OpenBSD (or other BSDs) virtual instances since
> > > > > > Meltdown patching? Is there anything to tweak at my end or am I
> > > > > > reliant on the provider?
> > > > > 
> > > > > -- Mark
> > > > > 
> > > > > There are a ton of threads talking about this issue, and it's not
> > > > > meltdown specific. Please search the archives.
> > > > 
> > > > -ml
> > > > 
> > [...]
> > > Also, Mark, could you say some more about the issue.  For instance,
> > > how long after a reboot does it take until you start to notice the
> > > issue, and how quickly does it get worse?
> > 
> > I'm another customer of Bytemark experiencing the same issue. I'm taking
> > care of one VM there and I'm primarily noticing it in two situations:
> > sleep() takes a long time (e.g. sleep(1) might take up to 40 seconds)
> > and the clock slows down.
> > 
> > Right now, 9 hours after reboot, the clock on VM is 3 hours behind real
> > clock. And sleep(1) takes 13 secs:
> > 
> > km@buildfarm ~ $ time sleep 1
> > 0m13.85s real 0m00.00s user 0m00.01s system
> > 
> > This all started after the host was patched and VM rebooted.
> > 
> > Bytemark guys are looking at the issue and doing their own debugging.
> > Here're findings so far:
> > 
> > I spun a few OpenBSD VMs up and left them overnight - looks like the
> > clock isn't drifting but there's still the 'time sleep 1' issue.
> > My testing results seemed to concur with User_4574's, virtio was slowing
> > down only a few minutes after a fresh install whereas compatibility
> > would stick at 1s, jump to 2s, etc. 
> >
> > > 
> > > Thanks,
> > > Kent
> > > 
> > 
> > -- 
> > -- Kirill Miazine 
> > 
> >
> 
> 


Re: em0: Hardware Initialization Failed

2017-12-01 Thread Stefan Fritsch
Hi Jan,

On Sat, 11 Nov 2017, Jan Stary wrote:

> This is current/amd64 on a Dell Latitude E5570 (dmesg below).
> When booting without the ethernet cable plugged in,
> the boot sequence finishes with the following message:
> 
>   em0: Hardware Initialization Failed
>   em0: Unable to initialize the hardware

We had similar problems with some HP laptops.

> When I boot with the cable plugged in, everything works as expected,
> like it always has. But now it seems that the ethernet cable _must_
> be plugged in at boot, otherwise em0 will just not work.
> 
> Can somebody with em(4) reproduce?
> How can I debug this?

Can you please try if the patch below helps?

If yes, can you please also try without the msec_delay line after the
"Magic delay ..." comment? Note that in our case, without that
delay, it would work most of the time but not always. So you will have
to try it several times (10 ... 20) to be sure that it's reliable.

I have only tested the patch with older openbsd releases, but I expect
it works on current, too.

Cheers,
Stefan

commit aa7c279debd5c66e1d2a0b3c18ceb20ef32ce7b7
Author: Stefan Fritsch <s...@sfritsch.de>
Date:   Fri Dec 1 09:56:58 2017 +0100

34236: em: Fixes/workarounds for em on HP laptops

Some em chips have a semaphore ("software flag") to synchronize access
to certain registers between OS and firmware (ME/AMT).

Make the logic to get the flag match the logic in freebsd. This includes
higher timeouts and waiting for a previous unlock to complete before
trying a lock again.

Another problem was that openbsd em driver calls em_get_software_flag
recursively, which causes the semaphore to be unlocked too early. Make
em_get_software_flag/em_release_software_flag handle this correctly.
Freebsd does not do this, but they have a mutex that probably allows
them to detect recursive calls to e1000_acquire_swflag_ich8lan().
Reworking the openbsd driver to not recursively get the semaphore would
be very invasive.
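
A minimal sketch of the recursion handling described above, assuming a simple nesting counter. The field name sw_flag is borrowed from the diff, but its use as a depth counter here, as well as the model_* names and the simulated hardware lock, are assumptions for illustration and not the actual em_get_software_flag/em_release_software_flag code.

```c
/* Hypothetical model of the hardware semaphore (not the real em(4) structures). */
struct model_hw {
	int sw_flag;		/* nesting depth of software-flag holders */
	int hw_locked;		/* 1 while the simulated hardware flag is held */
	int hw_acquires;	/* number of real hardware acquisitions */
};

/* Take the flag; only the outermost caller touches the hardware. */
int
model_get_software_flag(struct model_hw *hw)
{
	if (hw->sw_flag == 0) {
		if (hw->hw_locked)
			return -1;	/* held by someone else, e.g. firmware */
		hw->hw_locked = 1;
		hw->hw_acquires++;
	}
	hw->sw_flag++;
	return 0;
}

/* Drop the flag; only the outermost release unlocks the hardware. */
void
model_release_software_flag(struct model_hw *hw)
{
	if (hw->sw_flag == 0)
		return;			/* unbalanced release: ignore */
	if (--hw->sw_flag == 0)
		hw->hw_locked = 0;
}
```

With this shape, a nested get/release pair inside an outer pair leaves the hardware flag held until the outer release, which is the "unlocked too early" case the commit message refers to.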

Also port the logic from freebsd to em_check_phy_reset_block(). A single
read does not seem to be reliable.
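
The "single read is not reliable" point can be sketched as a bounded polling loop. This is an illustration only: the retry count, the 10 ms delay, the bit value and all model_* names are assumptions, and the register is simulated rather than read through E1000_READ_REG.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_FWSM_RSPCIPHY 0x00000040u	/* assumed bit value, for illustration */

typedef uint32_t (*model_read_fn)(void *ctx);
typedef void (*model_delay_fn)(unsigned ms);

/*
 * Poll the status register up to max_tries times.
 * Returns 0 if PHY reset is allowed, 1 if it is still blocked.
 */
int
model_check_phy_reset_block(model_read_fn read_fwsm, void *ctx,
    model_delay_fn delay, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		if (read_fwsm(ctx) & MODEL_FWSM_RSPCIPHY)
			return 0;	/* bit set: reset is not blocked */
		if (delay != NULL)
			delay(10);	/* give the firmware time to settle */
	}
	return 1;
}

/* Simulated register: reports "blocked" for the first blocked_reads reads. */
struct model_fwsm_sim {
	int blocked_reads;
	int reads;
};

uint32_t
model_sim_read(void *ctx)
{
	struct model_fwsm_sim *sim = ctx;

	sim->reads++;
	return (sim->reads > sim->blocked_reads) ? MODEL_FWSM_RSPCIPHY : 0;
}
```

A register that reads as blocked three times and then clears is reported as usable on the fourth read, where a single read would have returned "blocked".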

Also, increase delay after reset to 20ms, which is the value in freebsd
for ich8lan.

The changes so far make things more reliable, but not 100%. Add another
1ms delay that seems to help with the remaining #34195 problems on HP
elitebook.  A printf() at the same place helps, too.

While there, print mac+phy type in em_attach(), and em_init_hw() error
code if something goes wrong.

diff --git a/sys/dev/pci/if_em.c b/sys/dev/pci/if_em.c
index 985a464aaf9..5b6f3479bf5 100644
--- a/sys/dev/pci/if_em.c
+++ b/sys/dev/pci/if_em.c
@@ -545,6 +545,8 @@ em_attach(struct device *parent, struct device *self, void *aux)
if (!defer)
em_update_link_status(sc);
 
+   printf(", mac_type %#x phy_type %#x ", sc->hw.mac_type,
+   sc->hw.phy_type);
printf(", address %s\n", ether_sprintf(sc->sc_ac.ac_enaddr));
 
/* Indicate SOL/IDER usage */
@@ -1847,8 +1849,8 @@ em_hardware_init(struct em_softc *sc)
INIT_DEBUGOUT("\nHardware Initialization Deferred ");
return (EAGAIN);
}
-   printf("\n%s: Hardware Initialization Failed\n",
-  DEVNAME(sc));
+   printf("\n%s: Hardware Initialization Failed: %d\n",
+  DEVNAME(sc), ret_val);
return (EIO);
}
 
diff --git a/sys/dev/pci/if_em_hw.c b/sys/dev/pci/if_em_hw.c
index bd94aca904b..c2aa43ed342 100644
--- a/sys/dev/pci/if_em_hw.c
+++ b/sys/dev/pci/if_em_hw.c
@@ -929,7 +929,9 @@ em_reset_hw(struct em_hw *hw)
}
em_get_software_flag(hw);
E1000_WRITE_REG(hw, CTRL, (ctrl | E1000_CTRL_RST));
-   msec_delay(5);
+   /* HW reset releases software_flag */
+   hw->sw_flag = 0;
+   msec_delay(20);
 
/* Ungate automatic PHY configuration on non-managed 82579 */
if (hw->mac_type == em_pch2lan && !hw->phy_reset_disable &&
@@ -1473,6 +1475,8 @@ em_init_hw(struct em_hw *hw)
/* Set the media type and TBI compatibility */
em_set_media_type(hw);
 
+   /* Magic delay that improves problems with i219LM on HP Elitebook */
+   msec_delay(1);
/* Must be called after em_set_media_type because media_type is used */
em_initialize_hardware_bits(hw);
 
@@ -9504,9 +9508,18 @@ em_check_phy_reset_block(struct em_hw *hw)
DEBUGFUNC("em_check_phy_reset_block\n");
 
if (IS_ICH8(hw->mac_type)) {
-   fwsm = E1000_READ_REG(hw, FWSM);
-   return (fwsm & E1000_FWSM_RSPCIPHY) ? E1000_SUCCESS :
-

Re: KVM / Proxmox Hosted OpenBSD Boxes Multiqueue Virtio Query

2017-11-30 Thread Stefan Fritsch
On Friday, 1 December 2017 02:27:53 CET Tom Smyth wrote:
> Hello All
> I haven't seen much by way of advice about multiqueue virtio
> support on OpenBSD and I was wondering do other users use it ?
> does anyone have experience with setting the number of virtio
> queues in Proxmox for an OpenBSD guest ?
> It is suggested by
> proxmox  / KVM to set the number of Queues presented to a vm
> to be = the number of vCPUs you have assigned to the Guest.

OpenBSD does not yet support multiqueue for virtio, and it does not make much 
sense to add that until the network stack is more parallel.

Cheers,
Stefan



Re: Banana-Pi R2

2017-09-08 Thread Stefan Fritsch
On Wednesday, 6 September 2017 19:18:49 CEST Rui Ribeiro wrote:
> I once booted netbsd in my Banana Pi/Lamobo R1, which is a similar machine
> from "the same manufacturer"; the bigger problem is that outside Linux,
> there is no support for the Broadcom switching chipset.

The R2 is a completely different board (different SoC, different switch chip, 
both from mediatek). So in this case it's not only the switch chip but also 
the SoC that is not supported by openbsd.



Re: fd0 at fdc0 drive 0: density unknown

2017-09-08 Thread Stefan Fritsch
On Thursday, 7 September 2017 19:15:31 CEST Arfnokill wrote:
> Using snapshots on amd64. Since two days ago the kernel prints this fd0 at
> fdc0 drive 0: density unknown very late during boot.
> 
> It starts reordering libraries, and BAM... fd0 at fdc0 drive 0: density
> unknown in blue background. It's just cosmetic I guess, but it's
> uncomfortable.
> 
> Anybody else seeing this with recent snapshots?

The old behavior was that the kernel would wait after the "fdc0 ..." line 
until fd0 attaches. Now it does the waiting in the background and continues 
booting. I agree that it's a bit ugly, but it makes booting about 5 seconds 
faster.



Re: Panic booting snapshot image on QEMU virtual machine

2017-08-14 Thread Stefan Fritsch
On Tue, 15 Aug 2017, Stefan Fritsch wrote:

> On Mon, 14 Aug 2017, msheremet wrote:
> > I get kernel panic trying to boot snapshot image install61.iso on the
> > amd64 QEMU virtual machine. the problem started to happen with late
> > July snapshots. And it still happens with the latest snapshot I can
> > obtain (Aug 11). Here is the output of the boot process:
> 
> Try choosing a different emulated cpu in qemu. For this cpu type, openbsd 
> tries to make special MSR settings to work around cpu errata, but qemu 
> does not emulate those MSRs. At least that was the problem with another 
> similar panic that was reported to me.

Actually I hit send a bit too quickly. The panic looks different, so I 
don't know if it's the same issue. But another report said that qemu -cpu 
Opteron_G3 caused a panic but Opteron_G2 worked.

> > cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> > cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> > uvm_fault(0x81872dd0, 0xfff78183f2b0, 0, 1) -> e
> > fatal page fault in supervisor mode
> > trap type 6 code 0 rip 811c17a0 cs 8 rflags 10286 cr2
> > fff78183f2b0 cpl e rsp 81a058a0
> > panic: trap type 6, code=0, pc=811c17a0
> > 
> > The operating system has halted.
> > Please press any key to reboot.
> > 
> > 
> > 6.1 release image and earlier snapshots boot fine.
> > 
> > Could you please shed some light on this issue for I cannot find the
> > reason myself.
> > 
> > Thanks,
> > Maksym
> > 
> > 
> 
> 



Re: Panic booting snapshot image on QEMU virtual machine

2017-08-14 Thread Stefan Fritsch
On Mon, 14 Aug 2017, msheremet wrote:
> I get kernel panic trying to boot snapshot image install61.iso on the
> amd64 QEMU virtual machine. the problem started to happen with late
> July snapshots. And it still happens with the latest snapshot I can
> obtain (Aug 11). Here is the output of the boot process:

Try choosing a different emulated cpu in qemu. For this cpu type, openbsd 
tries to make special MSR settings to work around cpu errata, but qemu 
does not emulate those MSRs. At least that was the problem with another 
similar panic that was reported to me.

Cheers,
Stefan

> 
> > > OpenBSD/amd64 CDBOOT 3.28
> boot> boot /6.1/amd64/bsd.rd
> cannot open cd0a:/etc/random.seed: No such file or directory
> booting cd0a:/6.1/amd64/bsd.rd: 3361576+1459200+3877592+0+602112
> [72+426552+281440]=0x98d060
> entry point at 0x1000158
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2017 OpenBSD. All rights reserved.  https://www.OpenBSD.org
> 
> OpenBSD 6.1-current (RAMDISK_CD) #1: Fri Aug 11 21:30:26 MDT 2017
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
> real mem = 1056833536 (1007MB)
> avail mem = 1021108224 (973MB)
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf68e0 (9 entries)
> bios0: vendor SeaBIOS version "1.10.2-20170228_101828-anatol" date 04/01/2014
> bios0: QEMU Standard PC (i440FX + PIIX, 1996)
> acpi0 at bios0: rev 0
> acpi0: tables DSDT FACP APIC
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: AMD Opteron 23xx (Gen 3 Class Opteron), 3415.63 MHz
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,x2APIC,POPCNT,HV,NXE,LONG,LAHF,ABM,SSE4A,MASSE
> cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line
> 16-way L2 cache, 16MB 64b/line 16-way L3 cache
> cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> uvm_fault(0x81872dd0, 0xfff78183f2b0, 0, 1) -> e
> fatal page fault in supervisor mode
> trap type 6 code 0 rip 811c17a0 cs 8 rflags 10286 cr2
> fff78183f2b0 cpl e rsp 81a058a0
> panic: trap type 6, code=0, pc=811c17a0
> 
> The operating system has halted.
> Please press any key to reboot.
> 
> 
> 6.1 release image and earlier snapshots boot fine.
> 
> Could you please shed some light on this issue for I cannot find the
> reason myself.
> 
> Thanks,
> Maksym
> 
> 



Re: OpenBSD 6.1 installation, on dedicated server, using qemu not working.

2017-07-28 Thread Stefan Fritsch
On Tuesday, 25 July 2017 21:30:08 CEST Mxher wrote:
> I'm renting a dedicated server from a web host that unfortunately does
> not propose OpenBSD installation.
> 
> So I'm installing OpenBSD using qemu from my host rescue mode (which use
> FreeBSD).
> 
> 
> Usually it works like a charm but this time, on this server/hardware, it
> does not work: OpenBSD does not seem to start at all.
> Indeed when I boot with qemu I do not see any logs of the "normal" boot
> of the server (I only see qemu's boots in the logs).

Maybe I misunderstand what you are trying to do, but: There is sgabios for 
redirecting vga text output to serial console in qemu. Maybe that could help 
somehow? Or try using VNC console in qemu. Are you seeing the openbsd 
bootloader prompt? Are you then setting the console correctly in the openbsd 
bootloader?



Re: vio(4) stops working with debian 9.0 qemu-2.8+dfsg-6

2017-07-10 Thread Stefan Fritsch
On Saturday, 8 July 2017 14:58:59 CEST Stefan Fritsch wrote:
> A difference between i386 and amd64 is that on amd64, openbsd uses MSI-X for
> virtio. Maybe legacy interrupts are broken with vhost-net. This needs some
> more debugging. But it's either a bug in qemu or in the linux kernel, not in
> openbsd.

It's also broken with amd64 if MSI-X is disabled.

It's fixed in qemu 2.9. Debian bug report: 
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=867978

Cheers,
Stefan



Re: vio(4) stops working with debian 9.0 qemu-2.8+dfsg-6

2017-07-08 Thread Stefan Fritsch
Hi Karsten,

On Wednesday, 5 July 2017 12:51:26 CEST Karsten Horsmann wrote:
> found a very strange problem, which seems more to be a
> qemu/kvm issue. Maybe one of you knows what happens and
> can give me a hint to solve this (with vio).
> 
> I use a debian 9.0 kvm/qemu setup with bridge-networking.
> 
> After I upgraded from debian 8.0 to 9.0 I ran into a problem
> with my OpenBSD 6.0 i386, OpenBSD 6.1 i386 guests and the
> vio(4) driver.

That's weird. But I don't use i386 much anymore, so I didn't notice it so far. 
It seems qemu does not send rx interrupts to the guest. It does send tx 
interrupts, though. But the problem seems to occur only if vhost is used. As a 
workaround, set the driver to 'qemu' (instead of 'vhost') in the interface
section of the libvirt xml.

A difference between i386 and amd64 is that on amd64, openbsd uses MSI-X for 
virtio. Maybe legacy interrupts are broken with vhost-net. This needs some 
more debugging. But it's either a bug in qemu or in the linux kernel, not in 
openbsd.

Cheers,
Stefan



Re: how to debug OpenBSD virtio-scsi killing qemu-kvm VM?

2017-03-16 Thread Stefan Fritsch
On Tuesday, 14 March 2017 20:16:17 CET Jiri B wrote:
> Recent dmesg, and VM exits because of virtio-scsi issue when it is
> installing 'bsd.mp'.

I think I have fixed all the bugs, at least I could not get any corruption any 
more. The changes are in -current, in r1.5 of sys/dev/pv/vioscsi.c . Please 
try if that fixes your problems.

Cheers,
Stefan



Re: how to debug OpenBSD virtio-scsi killing qemu-kvm VM?

2017-03-14 Thread Stefan Fritsch
Hi,

On Mon, 13 Mar 2017, Jiri B wrote:
> 
> it seems virtio-scsi is not working correctly in OpenBSD, I gave it
> a try today and OpenBSD VM was killed with:
> 
>   2017-03-13T15:29:00.814657Z qemu-kvm: wrong size for virtio-scsi headers
> 
> on EL7 with qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64.
> 
> I found a bug stating it is OpenBSD's fault
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=768517
> 
> I'd like to provide more info but could you give me some hints
> please? I tried to attach debugger to qemu-kvm process but I get
> only this :/


Last time I looked at this, I found a bug in how vioscsi uses 
virtio_dequeue_commit(). After that was fixed, it did not cause qemu to 
complain anymore but there were occasional data errors that would result 
in filesystem corruption. I don't have a really good idea how to debug 
that. Maybe write a test program that writes known patterns to the raw 
disk and then reads them again and shows the diffs. Or does anyone know a 
program that does this? In any case, I ran out of time before I got any 
further.
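
A rough sketch of the kind of test program suggested above: write a known, block-dependent pattern, read it back, and report the blocks that differ. The pt_* names, the 512-byte block size and the pattern formula are assumptions for illustration. Running it against a raw disk device destroys the data on it, and stdio buffering on a regular file only exercises the logic, not the driver.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PT_BLOCK_SIZE 512

/* Fill one block with a pattern derived from its block number. */
void
pt_fill_block(uint8_t *buf, uint32_t blkno)
{
	size_t i;

	for (i = 0; i < PT_BLOCK_SIZE; i++)
		buf[i] = (uint8_t)(blkno * 131u + i * 7u);
}

/*
 * Write nblocks patterned blocks to path, read them back and print each
 * block that differs. Returns the number of bad blocks, or -1 on I/O error.
 */
long
pt_write_and_verify(const char *path, uint32_t nblocks)
{
	uint8_t want[PT_BLOCK_SIZE], got[PT_BLOCK_SIZE];
	uint32_t blk;
	long bad = 0;
	size_t i;
	FILE *fp;

	if ((fp = fopen(path, "wb")) == NULL)
		return -1;
	for (blk = 0; blk < nblocks; blk++) {
		pt_fill_block(want, blk);
		if (fwrite(want, 1, PT_BLOCK_SIZE, fp) != PT_BLOCK_SIZE) {
			fclose(fp);
			return -1;
		}
	}
	if (fclose(fp) != 0)
		return -1;

	if ((fp = fopen(path, "rb")) == NULL)
		return -1;
	for (blk = 0; blk < nblocks; blk++) {
		pt_fill_block(want, blk);
		if (fread(got, 1, PT_BLOCK_SIZE, fp) != PT_BLOCK_SIZE) {
			fclose(fp);
			return -1;
		}
		if (memcmp(want, got, PT_BLOCK_SIZE) != 0) {
			/* find the first differing byte for the report */
			for (i = 0; i < PT_BLOCK_SIZE && want[i] == got[i]; i++)
				;
			printf("block %u differs at offset %zu: "
			    "want 0x%02x got 0x%02x\n",
			    (unsigned)blk, i, want[i], got[i]);
			bad++;
		}
	}
	fclose(fp);
	return bad;
}
```

Because the pattern depends on the block number, data that lands in the wrong block shows up as a mismatch too, not only flipped bytes.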

The attached diff fixes the bug and adds lots of debug output. If you 
comment out the printfs, you could try if you have more luck. But this is 
probably more a topic for tech@

Cheers,
Stefan

diff --git sys/dev/pv/vioscsi.c sys/dev/pv/vioscsi.c
index 6a8bb55..1783bd5 100644
--- sys/dev/pv/vioscsi.c
+++ sys/dev/pv/vioscsi.c
@@ -32,12 +32,23 @@
 enum { vioscsi_debug = 0 };
 #define DPRINTF(f...) do { if (vioscsi_debug) printf(f); } while (0)
 
+#define STATE_ASSERT(vr, want) do {\
+   if (vr->vr_state != want) { \
+   panic("%s:%d: vr_state is %d should be %d\n",   \
+   __func__, __LINE__, vr->vr_state, want);\
+   } \
+   } while (0)
+
+
+enum vioscsi_req_state { FREE, ALLOC, INQUEUE, DONE };
+
 struct vioscsi_req {
struct virtio_scsi_req_hdr   vr_req;
struct virtio_scsi_res_hdr   vr_res;
struct scsi_xfer*vr_xs;
bus_dmamap_t vr_control;
bus_dmamap_t vr_data;
+   enum vioscsi_req_state   vr_state;
 };
 
 struct vioscsi_softc {
@@ -166,16 +177,19 @@ vioscsi_scsi_cmd(struct scsi_xfer *xs)
struct virtio_scsi_req_hdr *req = &vr->vr_req;
struct virtqueue *vq = &sc->sc_vqs[2];
int slot = vr - sc->sc_reqs;
+   int ec = 0;
 
DPRINTF("vioscsi_scsi_cmd: enter\n");
+   STATE_ASSERT(vr, ALLOC);
 
// TODO(matthew): Support bidirectional SCSI commands?
if ((xs->flags & (SCSI_DATA_IN | SCSI_DATA_OUT))
== (SCSI_DATA_IN | SCSI_DATA_OUT)) {
+   ec = __LINE__;
stuffup:
xs->error = XS_DRIVER_STUFFUP;
xs->resid = xs->datalen;
-   DPRINTF("vioscsi_scsi_cmd: stuffup\n");
+   printf("vioscsi_scsi_cmd: stuffup l.%d\n", ec);
scsi_done(xs);
return;
}
@@ -187,16 +201,20 @@ vioscsi_scsi_cmd(struct scsi_xfer *xs)
 * 1, second byte set to target, third and fourth byte representing a
 * single level LUN structure, followed by four zero bytes."
 */
-   if (xs->sc_link->target >= 256 || xs->sc_link->lun >= 16384)
+   if (xs->sc_link->target >= 256 || xs->sc_link->lun >= 16384) {
+   ec = __LINE__;
goto stuffup;
+   }
req->lun[0] = 1;
req->lun[1] = xs->sc_link->target;
req->lun[2] = 0x40 | (xs->sc_link->lun >> 8);
req->lun[3] = xs->sc_link->lun;
memset(req->lun + 4, 0, 4);
 
-   if ((size_t)xs->cmdlen > sizeof(req->cdb))
+   if ((size_t)xs->cmdlen > sizeof(req->cdb)) {
+   ec = __LINE__;
goto stuffup;
+   }
memset(req->cdb, 0, sizeof(req->cdb));
memcpy(req->cdb, xs->cmd, xs->cmdlen);
 
@@ -207,16 +225,21 @@ vioscsi_scsi_cmd(struct scsi_xfer *xs)
if (bus_dmamap_load(vsc->sc_dmat, vr->vr_data,
xs->data, xs->datalen, NULL,
((isread ? BUS_DMA_READ : BUS_DMA_WRITE) |
-BUS_DMA_NOWAIT)))
+BUS_DMA_NOWAIT))) {
+   ec = __LINE__;
goto stuffup;
+   }
nsegs += vr->vr_data->dm_nsegs;
}
 
int s = splbio();
int r = virtio_enqueue_reserve(vq, slot, nsegs);
splx(s);
-   if (r)
+   if (r) {
+   ec = __LINE__;
+   printf("nsegs: %d seg_max: %d datalen %d isread %d\n", nsegs, sc->sc_seg_max, xs->datalen, isread);
goto stuffup;
+   }
 
bus_dmamap_sync(vsc->sc_dmat, vr->vr_control,
offsetof(struct vioscsi_req, vr_req),
@@ -245,6 +268,7 @@ vioscsi_scsi_cmd(struct scsi_xfer *xs)
virtio_enqueue(vq, slot, vr->vr_data, 0);
 

Re: usb disk dirty after every reboot

2016-09-20 Thread Stefan Fritsch
On Tue, 20 Sep 2016, Darren Tucker wrote:

> On Tue, Sep 20, 2016 at 1:43 AM, Jan Stary  wrote:
> >
> > This is current/i386 on an ALIX.1E (dmesg below).
> > I have an USB disk connected for /backup.
> >
> > Upon every reboot, the filesystem on that disk is dirty:
> > WARNING: R/W mount of /backup denied.  Filesystem is not clean - run fsck
> 
> 
> I saw something similar on an APU where the root disk was on
> (USB-attached) sdcard:
> http://marc.info/?l=openbsd-misc&m=144237305322074&w=2
> 
> It seems to be a race.  There used to be a 4sec pause in the kernel
> that was removed:
> 
> """
> Remove 4 second delay on reboot/shutdown that was added 8 years
> ago to "workaround MP timeout/splhigh/scsi race at reboot time".
> """

I think before we re-add some arbitrary delays, we should check if we are 
actually sending an explicit cache flush command to the disk controllers. 
I have some code somewhere that does this for umount and mount -ur. I will 
look for it and send it to tech@, but probably not today.
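
For reference, the explicit flush for SCSI-style disks (which includes USB mass storage) is the SYNCHRONIZE CACHE (10) command, opcode 0x35. The sketch below only builds the 10-byte command block; the function name is hypothetical, and whether and where the kernel issues this command on umount or reboot is exactly what would need checking.

```c
#include <stdint.h>
#include <string.h>

#define SCSI_OP_SYNCHRONIZE_CACHE_10 0x35

/*
 * Build a SYNCHRONIZE CACHE (10) CDB. With nblocks == 0 the device is
 * asked to flush everything from lba to the end of the medium.
 */
void
build_sync_cache10(uint8_t cdb[10], uint32_t lba, uint16_t nblocks)
{
	memset(cdb, 0, 10);
	cdb[0] = SCSI_OP_SYNCHRONIZE_CACHE_10;
	cdb[2] = (uint8_t)(lba >> 24);		/* logical block address, big endian */
	cdb[3] = (uint8_t)(lba >> 16);
	cdb[4] = (uint8_t)(lba >> 8);
	cdb[5] = (uint8_t)lba;
	cdb[7] = (uint8_t)(nblocks >> 8);	/* number of blocks, big endian */
	cdb[8] = (uint8_t)nblocks;
}
```

Calling build_sync_cache10(cdb, 0, 0) gives the "flush the whole disk" form.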

Cheers,
Stefan



Re: Passwd cipher for YP

2015-11-19 Thread Stefan Fritsch
I am rather late to this thread...

On Thursday 15 October 2015 15:46:47, Raimo Niskanen wrote:
> > > Are there more password ciphers planned for the future e.g
> > > sha256 and sha512?
> > 
> >
> > No, we will not be adding those.
> >
> > 
> >
> > Those simple hashes do not provide the future-proof,
> > high-cost-to-crack features of bcrypt, which has made it
> > successful as industry staple. The dumb hashes even arrived years
> > after bcrypt, seems likely the result of choosing ideas "not
> > invented by openbsd"
> 
> Ouch!  And I have not seen any other upcoming ciphers
> mentioned.  These seem to be state of the art in the Linux world :/

... but if anyone wants to add their voice to 
https://sourceware.org/bugzilla/show_bug.cgi?id=16814 , maybe glibc 
could be made to reconsider bcrypt. AFAIK, glibc upstream is mostly 
different people now than when they added sha2 password hashes.



Re: Intel AMT serial-over-LAN with OpenBSD

2015-09-13 Thread Stefan Fritsch
On Tuesday 08 September 2015 07:24:56, Joe Gidi wrote:
> > It is worth pointing out that amtterm is only useable with AMT
> > versions <= 8
> > as AMT version 9 removed the SOAP interface that amtterm uses.  If
> > anyone knows of anything that can talk the ws-man protocol
> > variant AMT version 9 uses I'd like to hear about it.
> 
> This box has AMT 9, actually.
> 
> amtterm-cli, the SOL client, still works with it. However, as you
> said, the other components of the amtterm package don't work due to
> the SOAP interface being deprecated.
> 
> There apparently is some open-source code for wsman, but I don't see
> any sign that it's been ported to OpenBSD:
> 
> https://openwsman.github.io/
> http://en.community.dell.com/techcenter/b/techcenter/archive/2012/08/03/wsmancli-package-for-ubuntu

AIUI, the difference between legacy SOL redirection mode and what 
newer AMT versions do is that with the former, the SOL port is open by 
default while with newer version, one needs to enable it with wsman 
first. So, if the ME-BIOS does not offer "Legacy redirection mode", 
one needs to do these magical incantations:

wsman put  http://intel.com/wbem/wscim/1/amt-schema/1/AMT_RedirectionService \
-h $HOST -P 16992 -u admin -p $AMT_PASSWORD \
-k RFBPassword=$AMT_PASSWORD
wsman put  http://intel.com/wbem/wscim/1/amt-schema/1/AMT_RedirectionService \
-h $HOST -P 16992 -u admin -p $AMT_PASSWORD \
-k ListenerEnabled=true

After that, amtterm works until the next power down of the device. At 
least that worked with a Fujitsu Q775 (Broadwell). I am not sure about 
the AMT version, but I think it was 10. For wsman, I used the Ubuntu 
packages.

Also, one needs either amtterm 1.4 or the appropriate patches 
backported to 1.3, otherwise amtterm will disconnect at every reboot 
of the machine.

A friend of mine also did some script using curl that does 
powerup/reset/powerdown via the port 16992 AMT web interface and works 
without wsman. If there is interest, I could probably post it here.


Cheers,
Stefan



Re: Possible fix for i217 problem

2015-08-05 Thread Stefan Fritsch
Thanks to everyone for the testing. The patch is now committed.



Possible fix for i217 problem

2015-08-04 Thread Stefan Fritsch
Hi,

someone mentioned to me the i217-LM problems that were reported on misc 
end of May. It is possible that the patch below helps.

For us, it fixed a problem on a laptop with i217-LM (pci id 8086:153a) 
where the receiving of packets would stop until the battery of the laptop 
was removed (or until linux or freebsd were booted, which also have this 
workaround). A normal reboot or power-cycle without removing the battery 
did not help. Interestingly, not even the Intel PXE BIOS has the 
workaround.

The problem would happen if the LAN cable was plugged in after the card 
had already been initialized. If the LAN cable was always plugged in when 
the laptop was powered on, the problem would not appear.

The workaround is part of the e1000_lv_jumbo_workaround_ich8lan() function 
in e1000_ich8lan.c in freebsd, but only the part that is used if jumbo 
packets are *not* configured. Linux has the same fix as 
b20a774495671f037e7160ea2ce87 and da1e2046e5f5ab268e55d30d6b74099ade0aeb6f 
with some more info in the commit messages.
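
The workaround mostly consists of resetting individual register fields to their hardware defaults. A small sketch of that clear-then-set idiom on a 16-bit register value follows; the helper name set_bits16 is hypothetical and is not part of the driver.

```c
#include <stdint.h>

/*
 * Replace the field selected by (mask << shift) in reg with value.
 * The field is cleared first, then the new value is OR-ed in.
 */
uint16_t
set_bits16(uint16_t reg, uint16_t mask, unsigned shift, uint16_t value)
{
	reg &= (uint16_t)~(mask << shift);		/* clear the field */
	reg |= (uint16_t)((value & mask) << shift);	/* set the new value */
	return reg;
}
```

For example, set_bits16(data, 0xF, 8, 0xB) sets a 4-bit field at bit 8 to 0xB and leaves all other bits untouched.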

This probably has quite some potential to cause regressions with other 
boards, so I am not sure if it should go in before 5.8 release.

Cheers,
Stefan


--- a/sys/dev/pci/if_em_hw.c
+++ b/sys/dev/pci/if_em_hw.c
@@ -91,6 +91,7 @@ static int32_t em_id_led_init(struct em_hw *);
 static int32_t em_init_lcd_from_nvm_config_region(struct em_hw *,  uint32_t,
uint32_t);
 static int32_t em_init_lcd_from_nvm(struct em_hw *);
+static int32_t em_phy_no_cable_workaround(struct em_hw *);
 static void em_init_rx_addrs(struct em_hw *);
 static void em_initialize_hardware_bits(struct em_hw *);
 static boolean_t em_is_onboard_nvm_eeprom(struct em_hw *);
@@ -7018,6 +7019,96 @@ em_read_mac_addr(struct em_hw *hw)
 }
 
 /**
+ * Explicitly disables jumbo frames and resets some PHY registers back to hw-
+ * defaults. This is necessary in case the ethernet cable was inserted AFTER
+ * the firmware initialized the PHY. Otherwise it is left in a state where
+ * it is possible to transmit but not receive packets. Observed on I217-LM and
+ * fixed in FreeBSD's sys/dev/e1000/e1000_ich8lan.c.
+ *
+ * hw - Struct containing variables accessed by shared code
+ */
+STATIC int32_t
+em_phy_no_cable_workaround(struct em_hw *hw) {
+   int32_t ret_val, dft_ret_val;
+   uint32_t mac_reg;
+   uint16_t data, phy_reg;
+
+   /* disable Rx path while enabling workaround */
+   em_read_phy_reg(hw, I2_DFT_CTRL, &phy_reg);
+   ret_val = em_write_phy_reg(hw, I2_DFT_CTRL, phy_reg | (1 << 14));
+   if (ret_val)
+   return ret_val;
+
+   /* Write MAC register values back to h/w defaults */
+   mac_reg = E1000_READ_REG(hw, FFLT_DBG);
+   mac_reg &= ~(0xF << 14);
+   E1000_WRITE_REG(hw, FFLT_DBG, mac_reg);
+
+   mac_reg = E1000_READ_REG(hw, RCTL);
+   mac_reg &= ~E1000_RCTL_SECRC;
+   E1000_WRITE_REG(hw, RCTL, mac_reg);
+
+   ret_val = em_read_kmrn_reg(hw, E1000_KUMCTRLSTA_OFFSET_CTRL, &data);
+   if (ret_val)
+   goto out;
+   ret_val = em_write_kmrn_reg(hw, E1000_KUMCTRLSTA_OFFSET_CTRL,
+   data & ~(1 << 0));
+   if (ret_val)
+   goto out;
+
+   ret_val = em_read_kmrn_reg(hw, E1000_KUMCTRLSTA_OFFSET_HD_CTRL, &data);
+   if (ret_val)
+   goto out;
+
+   data &= ~(0xF << 8);
+   data |= (0xB << 8);
+   ret_val = em_write_kmrn_reg(hw, E1000_KUMCTRLSTA_OFFSET_HD_CTRL, data);
+   if (ret_val)
+   goto out;
+
+   /* Write PHY register values back to h/w defaults */
+   em_read_phy_reg(hw, I2_SMBUS_CTRL, &data);
+   data &= ~(0x7F << 5);
+   ret_val = em_write_phy_reg(hw, I2_SMBUS_CTRL, data);
+   if (ret_val)
+   goto out;
+
+   em_read_phy_reg(hw, I2_MODE_CTRL, &data);
+   data |= (1 << 13);
+   ret_val = em_write_phy_reg(hw, I2_MODE_CTRL, data);
+   if (ret_val)
+   goto out;
+
+   /*
+* 776.20 and 776.23 are not documented in
+* i217-ethernet-controller-datasheet.pdf...
+*/
+   em_read_phy_reg(hw, PHY_REG(776, 20), &data);
+   data &= ~(0x3FF << 2);
+   data |= (0x8 << 2);
+   ret_val = em_write_phy_reg(hw, PHY_REG(776, 20), data);
+   if (ret_val)
+   goto out;
+
+   ret_val = em_write_phy_reg(hw, PHY_REG(776, 23), 0x7E00);
+   if (ret_val)
+   goto out;
+
+   em_read_phy_reg(hw, I2_PCIE_POWER_CTRL, &data);
+   ret_val = em_write_phy_reg(hw, I2_PCIE_POWER_CTRL, data & ~(1 << 10));
+   if (ret_val)
+   goto out;
+
+out:
+   /* re-enable Rx path after enabling workaround */
+   dft_ret_val = em_write_phy_reg(hw, I2_DFT_CTRL, phy_reg & ~(1 << 14));
+   if (ret_val)
+   return ret_val;
+   else
+   return dft_ret_val;
+}
+