Re: Snapshots in QEMU

2018-07-31 Thread Mike Larkin
On Tue, Jul 31, 2018 at 04:43:22PM -0700, Mike Larkin wrote:
> On Tue, Jul 31, 2018 at 04:59:54PM -0300, Elias M. Mariani wrote:
> > And segmentation fault after applying the patches on OPENBSD_63.
> > Maybe is just a coincidence, according to the data:
> > HOST: ubuntu
> > QEMU: Running OPENBSD_63 + patches
> > (was working OK until patched).
> > 
> > Cheers.
> > Elias.
> > 
> > 2018-07-31 16:36 GMT-03:00 Elias M. Mariani :
> > > Trying to boot in QEMU from snapshots/amd64/cd63.iso or from
> > > snapshots/amd64/bsd.rd from the current disk:
> > > The booting starts but the machine gets rebooted almost immediately.
> > >
> > > I'm reporting this pretty bad. But is because I'm using 6.3 in that
> > > cloud server and just wanted to test if the snapshot was booting
> > > correctly. QEMU is running in godsknowswhat.
> > > I have never used QEMU myself, could someone try to reproduce ?
> > > using cd63.iso from OPENBSD_63 gives no problems.
> > >
> > > Cheers.
> > > Elias.
> 
> 
> This may be related to the recent speculation/lfence fix that went in
> a week or so ago. It reads an MSR that should be present on that CPU (and
> if it isn't, we won't read it).
> 
> I have one of those same Opterons here, I'll update it to -current and see
> if I can repro this on real hardware. My guess is that kvm is failing the
> RDMSR because the knowledge of it post-dates the time that kvm was built.
> 
> -ml
> 

PS: "show registers" at the ddb> prompt would confirm whether it is indeed
this fix. If you could do that and report the output, it would be appreciated.



Re: Snapshots in QEMU

2018-07-31 Thread Mike Larkin
On Tue, Jul 31, 2018 at 04:59:54PM -0300, Elias M. Mariani wrote:
> And segmentation fault after applying the patches on OPENBSD_63.
> Maybe is just a coincidence, according to the data:
> HOST: ubuntu
> QEMU: Running OPENBSD_63 + patches
> (was working OK until patched).
> 
> Cheers.
> Elias.
> 
> 2018-07-31 16:36 GMT-03:00 Elias M. Mariani :
> > Trying to boot in QEMU from snapshots/amd64/cd63.iso or from
> > snapshots/amd64/bsd.rd from the current disk:
> > The booting starts but the machine gets rebooted almost immediately.
> >
> > I'm reporting this pretty bad. But is because I'm using 6.3 in that
> > cloud server and just wanted to test if the snapshot was booting
> > correctly. QEMU is running in godsknowswhat.
> > I have never used QEMU myself, could someone try to reproduce ?
> > using cd63.iso from OPENBSD_63 gives no problems.
> >
> > Cheers.
> > Elias.


This may be related to the recent speculation/lfence fix that went in
a week or so ago. It reads an MSR that should be present on that CPU (and
if it isn't, we won't read it).

I have one of those same Opterons here, I'll update it to -current and see
if I can repro this on real hardware. My guess is that kvm is failing the
RDMSR because the knowledge of it post-dates the time that kvm was built.

-ml



Re: slowcgi -u user option does not change socket ownership

2018-07-31 Thread Andrew Daugherity
On Sun, Jul 29, 2018 at 11:07 AM, Florian Obser  wrote:
> It is behaving as intended. The slowcgi.sock is for the webserver to
> interact with. The specified user is not supposed to interact with the
> socket. CGI scripts are executed as this user.
>
> slowcgi itself can use the socket just fine since it already has a
> filedescriptor open.
>
> What problem are you trying to solve?

I ported slowcgi to Linux [1], (primarily) for use with nginx, since
the commonly recommended alternative 'fcgiwrap' seems possibly
unmaintained, and is a bit heavyweight in comparison.

openSUSE gives nginx its own user, separate from the wwwrun user used
by Apache etc.  I figured making wwwrun the compile-time default and
using '-u nginx' when needed would suffice, but it didn't, as nginx
was unable to access the socket.

Running it as 'andrew' in this bug report was just a verification that
this also occurs on OpenBSD, and wasn't a porting issue.  It seemed
like setting the user should also set the socket owner, and appeared
that the socket was just created too "early" (since the chroot etc. is
done after setting the user).  Your explanation makes sense; I
honestly never considered that the -u option was *not* supposed to
also set the socket ownership.

Obviously I could chown the socket after startup, or add yet another
option for socket ownership, but this seemed like a cleaner fix.
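
For readers following along, handing a UNIX socket to a given owner at
creation time can be sketched roughly as below. This is a minimal sketch
and not slowcgi's actual slowcgi_listen(): the helper name listen_owned(),
the backlog of 5 and the 0660 mode are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from slowcgi): bind a UNIX stream socket at
 * path, give it to uid/gid with mode 0660, and start listening.
 * Returns the listening fd, or -1 on error.
 */
int
listen_owned(const char *path, uid_t uid, gid_t gid)
{
	struct sockaddr_un addr;
	size_t len;
	int fd;

	assert(path != NULL);
	len = strlen(path);
	if (len >= sizeof(addr.sun_path))
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path, path, len + 1);	/* length checked above */

	if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		return -1;
	(void)unlink(path);			/* drop a stale socket, if any */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
	    chown(path, uid, gid) == -1 ||	/* owner the web server runs as */
	    chmod(path, 0660) == -1 ||
	    listen(fd, 5) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Which uid/gid gets passed in is exactly the question in this thread: the
user the web server runs as, not necessarily the user the CGI scripts are
executed as.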

Related: in the same section of code (at the end of my diff actually,
as context), I noticed that when -u is used, the chroot path is set to
the target user's home directory instead of /var/www.  I found this
surprising, so I added a manpage diff to my patchset:

--- slowcgi.8 2017-10-17 17:47:58.0 -0500
+++ slowcgi.8 2018-07-26 13:34:06.459779115 -0500
@@ -78,7 +78,9 @@
 .It Fl u Ar user
 Drop privileges to
 .Ar user
-instead of default user www.
+instead of the default www, and chroot to that user's home directory,
+unless you specify otherwise with
+.Fl p .
 .El
 .Sh SEE ALSO
 .Xr httpd 8

Perhaps that's a bit too wordy and only the first line is needed, I dunno.

Thanks for the software, it works great for me so far! (At least for
running Nagios...)


-Andrew

[1] https://github.com/adaugherity/slowcgi-portable
Not that hard to port, thanks to libbsd.  The only thing missing was
getdtablecount() and of course pledge().



>> >Fix:
>> Moving the slowcgi_listen() call to after the pw struct is set to
>> slowcgi_user fixes it:
>> 
>> --- usr.sbin/slowcgi/slowcgi.c  2018-07-25 20:46:56.358667880 -0500
>> +++ usr.sbin/slowcgi/slowcgi.c  2018-07-26 15:14:52.840052633 -0500
>> @@ -330,13 +330,13 @@
>>   if (pw == NULL)
>>   lerrx(1, "no %s user", SLOWCGI_USER);
>>
>> - fd = slowcgi_listen(fcgi_socket, pw);
>> -
>>   lwarnx("slowcgi_user: %s", slowcgi_user);
>>   pw = getpwnam(slowcgi_user);
>>   if (pw == NULL)
>>   lerrx(1, "no %s user", slowcgi_user);
>>
>> + fd = slowcgi_listen(fcgi_socket, pw);
>> +
>>   if (chrootpath == NULL)
>>   chrootpath = pw->pw_dir;
>> 



Re: axen Ethernet device errors on both USB3.0 and USB2.0 ports

2018-07-31 Thread sc dying
On Mon, Jul 30, 2018 at 12:11 AM,   wrote:
>  - Use DMA buffer aligned at 64KB boundary to avoid xhci bug.

In xhci_event_xfer(), xfer->actlen should be calculated from the sum of
the transferred TRB lengths, not from xfer->length - remain, when the
transfer is split into multiple TRBs.

The xhci driver may split a transfer into multiple TRBs if the DMA
buffer is larger than 64kB or crosses a 64kB boundary.
For example, with the unpatched axen, the 24kB buffer of an SS Bulk-In
transfer might be split as follows.

 TRB #0 bulk-in len 0x1000 paddr 0xda3ff000 (CHAIN)
 TRB #1 bulk-in len 0x5000 paddr 0xda40 (IOC)

On completion of each TRB, xhci_event_xfer() calculates actlen.
The inbound packet is usually smaller than the given buffer, so
TRB #0 ends with SHORT_XFER and TRB #1 ends with SUCCESS.

 bulk-in idx 2 last 3 len 0x6000 remain 0xf98 (for TRB #0)
 bulk-in idx 3 last 3 len 0x6000 remain 0x5000 (for TRB #1)

For TRB #0, the actlen is xfer->length - remain = 0x6000 - 0xf98 = 0x5068.
This value is stored in xfer->actlen.
The actlen of #1 is 0x6000 - 0x5000 = 0x1000, but xfer->actlen is already
non-zero, so the actlen of #1 is ignored.
Thus the actlen of this transfer is determined to be 0x5068, when it is
actually 0x68.

On USB2 the unpatched axen uses a 16kB buffer.
That is smaller than the 24kB used for SS, so the buffer is less likely
to cross a 64kB boundary and you might not hit the problem.
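
The arithmetic above can be replayed in a few lines of C. This is only a
sketch of the two ways of computing actlen, using the numbers from the
trace; struct trb_done and both function names are made up for the
illustration and are not the xhci(4) data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-TRB completion record (not an xhci(4) structure). */
struct trb_done {
	uint32_t len;		/* bytes requested by this TRB */
	uint32_t remain;	/* bytes left untransferred, from the event */
};

/*
 * Behaviour described above: actlen = xfer->length - remain, and once
 * actlen is non-zero the results of later TRBs are ignored.
 */
uint32_t
actlen_from_total(uint32_t xfer_len, const struct trb_done *t, size_t n)
{
	uint32_t actlen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(t[i].remain <= xfer_len);
		if (actlen == 0)
			actlen = xfer_len - t[i].remain;
	}
	return actlen;
}

/* Suggested calculation: sum what each TRB actually transferred. */
uint32_t
actlen_from_trbs(const struct trb_done *t, size_t n)
{
	uint32_t actlen = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(t[i].remain <= t[i].len);
		actlen += t[i].len - t[i].remain;
	}
	return actlen;
}
```

Fed TRB #0 (len 0x1000, remain 0xf98) and TRB #1 (len 0x5000, remain
0x5000) with a total length of 0x6000, the first function yields 0x5068
and the second yields 0x68.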



Re: double fault trap, code=0

2018-07-31 Thread Giovanni Bechis
On Tue, Jul 31, 2018 at 02:04:53PM -0700, Philip Guenther wrote:
> On Tue, 31 Jul 2018, giova...@paclan.it wrote:
> > >Synopsis:  Every now and then I hit ddb with double fault trap, code=0
> > >Category:  acpi
> > >Environment:
> > System  : OpenBSD 6.3
> > Details : OpenBSD 6.3-current (GENERIC) #143: Fri Jul 27 04:38:01 MDT 2018
> >           dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> > >Description:
> > Every couple of days I hit ddb:
> > double fault trap, code=0
> > Stopped at __mtx_enter+0xf: pushq %r11
> > ddb{0}> bt
> > __mtx_enter(0) at __mtx_enter+0xf
> > i915_get_crtc_scanoutpos(f69da68b441aff13,80169156,8015f800,8015f800,1,0) at i915_get_crtc_scanoutpos+0xce
> > drm_calc_vbltimestamp_from_scanoutpos(6551790fe8f1e00a,0,8015f800,0,8015f800,453d) at drm_calc_vbltimestamp_from_scanoutpos+0x92
> > drm_update_vblank_count() at drm_update_vblank_count+0x9b
> > drm_handle_vblank() at drm_handle_vblank+0xd1
> > ironlake_irq_handler(575039e4693defc4,8015d700) at ironlake_irq_handler+0x320
> > intr_handler(84d668fbeb1151573,0) at intr_handler+0x68
> > Xintr_ioapic_edge16_untramp(0,0,1,0,81b329e8,ff012cdc33f0) at Xintr_ioapic_edge16_untramp+0x19f
> > uvm_map_addr_RBT_AUGMENT(1aa311ba321d33a4) at uvm_map_addr_RBT_AUGMENT
> > uvm_mapent_addr_remove(81c9ba58,800032d2) at uvm_mapent_addr_remove+0x67
> > uvm_mapent_mkfree(709092b46cd75947,800032d2,ff012cdc33f0,81c9ba58,ff012cdc3000) at uvm_mapent_mkfree+0xc9
> > uvm_unmap_remove(da20d5b12b851735,800032d2000,81c9ba58,800032cb2540,800032d1f000,1) at uvm_unmap_remove+0x2cf
> > uvm_unmap(709092b46c9e3347,800032d1f000,800032d2) at uvm_unmap+0x75
> > km_free(4033f06f727571cc,514,0,1000) at km_free+0x4f
> > _bus_space_unmap(ddf1a0675f9a0f6,1,0,81b7aaf8) at _bus_space_unmap+0xdd
> > acpi_gasio(ad65595461093493,0,0,809293a0,800032cb2768,1) at acpi_gasio+0x242
> > aml_opreg_sysmem_handler(14f676a0776defdb,800032cb2748,818347d0,800032cb26d0,ad65595461093493) at aml_opreg_sysmem_handler+0x30
> > aml_rwgen(221071f83d28047,80929388,80062088,28a2,8048188,1) at aml_rwgen+0x650
> > aml_rwfield(771ec935baef5db0,8075f308,69,69,80062088) at aml_rwfield+0x3a5
> > aml_eval(79dedd78a5dfabfb,8075f308,8035031,69,80062088) at aml_eval+0x1f7
> > aml_parse(de6bf302f7589bb6,8075f308,80035021) at aml_parse+0x54
> > three more pages of the last line
> > aml_eval(e4d65b8caee09c80,0,80089408,2,0) at aml_eval+0x323
> > aml_evalnode(bcb11c8975c0ae9,80026400,80026400,2,0) at aml_evalnode+0xae
> > acpi_gpe(2c80ceef08cd0301,80026400,8002bc40) at acpi_gpe+0x35
> > acpi_thread(0) at acpi_thread+0x188
> > end trace frame: 0x0, count: -65
> 
> And quoting a previous off-list email:
> > every now and then, starting from at least a month ago my laptop
> > enters ddb with "Double fault trap, code=0".
> > Most of the times it is in ieee80211 and at a first glance I
> > looked at iwm(4), but today it happened also with intel(4).
> 
> So it's double-faulting because it's running off the end of the kernel 
> stack for the ACPI thread due to a combination of deeply nested AML and 
> stack usage by the DRM and/or 802.11 interrupt handlers.
> 
> I don't see any recent changes in the ACPI stack which would cause a 
> change in behavior on this box (it doesn't have GenericSerialBus, or _DSD 
> properties, or an sdhc device), so either
>  a) the change in stack consumption is from the DRM and 802.11 side, OR
>  b) did you update the BIOS around the time this started?
>   > bios0: vendor TOSHIBA version "Version 5.10" date 04/18/2018
> Perhaps the new version uses more deeply nested AML.
> 
I do not have enough dmesg log files, but it could be related to a BIOS
update; I had completely forgotten about that.

> 
> For those wondering about the iwm/802.11 case, the photo previously sent 
> had the trace of the interrupt frame going, from bottom up:
> 
> -> Xintr_ioapic_edge24_untramp
> -> intr_handler
> -> iwm_intr
> -> iwm_rx_pkt
> -> iwm_rx_mpdu
> -> iwm_rx_frame
> -> ieee80211_input
> -> ieee80211_recv_probe_resp
> -> ieee80211_find_node_for_beacon
> 
> Are any of those using more stack-space than before?
> 
> 
> Not sure what we want to do here.
>  - if this did start after updating the BIOS, see if there's a newer one 
>or maybe downgrade
There isn't an update available, and a downgrade does not seem possible.

>  - if we can identify an increase in stack use in an interrupt path, we 
>should fix that
>  - making 

Re: double fault trap, code=0

2018-07-31 Thread Philip Guenther
On Tue, 31 Jul 2018, giova...@paclan.it wrote:
> >Synopsis:Every now and then I hit ddb with double fault trap, code=0
> >Category:acpi
> >Environment:
>   System  : OpenBSD 6.3
>   Details : OpenBSD 6.3-current (GENERIC) #143: Fri Jul 27 04:38:01 MDT 2018
>             dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> >Description:
>   Every couple of days I hit ddb:
>   double fault trap, code=0
>   Stopped at __mtx_enter+0xf: pushq %r11
>   ddb{0}> bt
>   __mtx_enter(0) at __mtx_enter+0xf
>   i915_get_crtc_scanoutpos(f69da68b441aff13,80169156,8015f800,8015f800,1,0) at i915_get_crtc_scanoutpos+0xce
>   drm_calc_vbltimestamp_from_scanoutpos(6551790fe8f1e00a,0,8015f800,0,8015f800,453d) at drm_calc_vbltimestamp_from_scanoutpos+0x92
>   drm_update_vblank_count() at drm_update_vblank_count+0x9b
>   drm_handle_vblank() at drm_handle_vblank+0xd1
>   ironlake_irq_handler(575039e4693defc4,8015d700) at ironlake_irq_handler+0x320
>   intr_handler(84d668fbeb1151573,0) at intr_handler+0x68
>   Xintr_ioapic_edge16_untramp(0,0,1,0,81b329e8,ff012cdc33f0) at Xintr_ioapic_edge16_untramp+0x19f
>   uvm_map_addr_RBT_AUGMENT(1aa311ba321d33a4) at uvm_map_addr_RBT_AUGMENT
>   uvm_mapent_addr_remove(81c9ba58,800032d2) at uvm_mapent_addr_remove+0x67
>   uvm_mapent_mkfree(709092b46cd75947,800032d2,ff012cdc33f0,81c9ba58,ff012cdc3000) at uvm_mapent_mkfree+0xc9
>   uvm_unmap_remove(da20d5b12b851735,800032d2000,81c9ba58,800032cb2540,800032d1f000,1) at uvm_unmap_remove+0x2cf
>   uvm_unmap(709092b46c9e3347,800032d1f000,800032d2) at uvm_unmap+0x75
>   km_free(4033f06f727571cc,514,0,1000) at km_free+0x4f
>   _bus_space_unmap(ddf1a0675f9a0f6,1,0,81b7aaf8) at _bus_space_unmap+0xdd
>   acpi_gasio(ad65595461093493,0,0,809293a0,800032cb2768,1) at acpi_gasio+0x242
>   aml_opreg_sysmem_handler(14f676a0776defdb,800032cb2748,818347d0,800032cb26d0,ad65595461093493) at aml_opreg_sysmem_handler+0x30
>   aml_rwgen(221071f83d28047,80929388,80062088,28a2,8048188,1) at aml_rwgen+0x650
>   aml_rwfield(771ec935baef5db0,8075f308,69,69,80062088) at aml_rwfield+0x3a5
>   aml_eval(79dedd78a5dfabfb,8075f308,8035031,69,80062088) at aml_eval+0x1f7
>   aml_parse(de6bf302f7589bb6,8075f308,80035021) at aml_parse+0x54
>   three more pages of the last line
>   aml_eval(e4d65b8caee09c80,0,80089408,2,0) at aml_eval+0x323
>   aml_evalnode(bcb11c8975c0ae9,80026400,80026400,2,0) at aml_evalnode+0xae
>   acpi_gpe(2c80ceef08cd0301,80026400,8002bc40) at acpi_gpe+0x35
>   acpi_thread(0) at acpi_thread+0x188
>   end trace frame: 0x0, count: -65

And quoting a previous off-list email:
> every now and then, starting from at least a month ago my laptop
> enters ddb with "Double fault trap, code=0".
> Most of the times it is in ieee80211 and at a first glance I
> looked at iwm(4), but today it happened also with intel(4).

So it's double-faulting because it's running off the end of the kernel 
stack for the ACPI thread due to a combination of deeply nested AML and 
stack usage by the DRM and/or 802.11 interrupt handlers.

I don't see any recent changes in the ACPI stack which would cause a 
change in behavior on this box (it doesn't have GenericSerialBus, or _DSD 
properties, or an sdhc device), so either
 a) the change in stack consumption is from the DRM and 802.11 side, OR
 b) did you update the BIOS around the time this started?
> bios0: vendor TOSHIBA version "Version 5.10" date 04/18/2018
Perhaps the new version uses more deeply nested AML.


For those wondering about the iwm/802.11 case, the photo previously sent 
had the trace of the interrupt frame going, from bottom up:

-> Xintr_ioapic_edge24_untramp
-> intr_handler
-> iwm_intr
-> iwm_rx_pkt
-> iwm_rx_mpdu
-> iwm_rx_frame
-> ieee80211_input
-> ieee80211_recv_probe_resp
-> ieee80211_find_node_for_beacon

Are any of those using more stack-space than before?


Not sure what we want to do here.
 - if this did start after updating the BIOS, see if there's a newer one 
   or maybe downgrade
 - if we can identify an increase in stack use in an interrupt path, we 
   should fix that
 - making aml_parse() iterative instead of recursive...by tracking frames 
   of AML state in an explicit stack...would be annoying, more complex to 
   maintain, and probably inefficient.  Maybe it's time to let kernel 
   threads request a larger than default stack size and have acpi_thread 
   request another page or so?
 - if all else fails, there's always increasing UPAGES...  
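
To make the "explicit stack" option concrete, here is a toy sketch of the
general technique, assuming a simple binary tree in place of nested AML.
None of these names (struct node, struct frame, eval_iter) come from the
acpi code; they are invented for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for nested AML: a leaf holds a value, others sum children. */
struct node {
	int value;			/* used only when both children are NULL */
	const struct node *left, *right;
};

/* One explicit frame replaces the locals of one recursive call. */
struct frame {
	const struct node *n;
	int state;			/* 0: go left, 1: go right, 2: done */
	int acc;			/* partial result for this node */
};

/*
 * Evaluate the tree without recursion.  Frames live on the heap, so
 * nesting depth costs allocated memory instead of thread stack.
 * Returns 0 on success, -1 if nesting exceeds maxdepth or on allocation
 * failure.
 */
int
eval_iter(const struct node *root, size_t maxdepth, int *out)
{
	struct frame *st, *f;
	const struct node *child;
	size_t sp = 0;
	int result = 0;

	assert(out != NULL);
	if (root == NULL || maxdepth == 0)
		return -1;
	if ((st = calloc(maxdepth, sizeof(*st))) == NULL)
		return -1;

	st[sp++] = (struct frame){ root, 0, 0 };
	while (sp > 0) {
		f = &st[sp - 1];
		if (f->n->left == NULL && f->n->right == NULL) {
			f->acc = f->n->value;	/* leaf: nothing to descend into */
			f->state = 2;
		}
		if (f->state < 2) {
			child = (f->state == 0) ? f->n->left : f->n->right;
			f->state++;
			if (child == NULL)
				continue;
			if (sp == maxdepth) {	/* too deep: fail cleanly */
				free(st);
				return -1;
			}
			st[sp++] = (struct frame){ child, 0, 0 };
			continue;
		}
		/* Frame finished: pop it and hand the result to the parent. */
		result = f->acc;
		if (--sp > 0)
			st[sp - 1].acc += result;
	}
	free(st);
	*out = result;
	return 0;
}
```

Even in this toy the state machine is noticeably clumsier than the
recursive version would be. The one thing it buys is that excessive
nesting turns into an error return rather than a stack overflow.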


Philip 

Snapshots in QEMU

2018-07-31 Thread Elias M. Mariani
I am trying to boot in QEMU from snapshots/amd64/cd63.iso or from
snapshots/amd64/bsd.rd on the current disk.
Booting starts, but the machine reboots almost immediately.

I know this is a poor report, but that is because I'm using 6.3 on that
cloud server and just wanted to test whether the snapshot boots
correctly. QEMU is running on god-knows-what.
I have never used QEMU myself; could someone try to reproduce this?
Using cd63.iso from OPENBSD_63 gives no problems.

Cheers.
Elias.



Re: Kernel Panic 6.3 and HP DL360 Gen9

2018-07-31 Thread Albert Martinez
Hi,

Sorry for the delay; we had to move the testing to another server and
set everything up again. This one is an HP DL360 G6, but it is happening
just the same.

Please find attached the standard outputs from
https://www.openbsd.org/ddb.html plus the ones that you requested. This
time it was all done on a fresh install of the latest 6.3-current.

Regards,
Albert Martinez

On 05/07/18 10:36, Albert Martinez wrote:
> Hi,
>
> Ok, then I will perform this week the upgrade to the last current.
>
> Regards,
> Albert
>
> The outputs from  "sh malloc" and the  "sh all pools" from the 6.3 
> stable:
>
> panic: malloc: out of space in kmem_map
> Stopped at  db_enter+0x5:   popq    %rbp
>     TID    PID    UID PRFLAGS PFLAGS  CPU  COMMAND
> * 52231  1  0    0x13  0    2K ksh
> db_enter() at db_enter+0x5
> panic() at panic+0x129
> malloc(800020bc1570,800020bc1510,1) at malloc+0x6fa
> ufs_readdir(1) at ufs_readdir+0x101
> VOP_READDIR(800020a8c018,3b0f315201378d58,800020bc155c,ff087f7ca480) at VOP_READDIR+0x45
> sys_getdents(630,800020a8c018,0) at sys_getdents+0x125
> syscall() at syscall+0x279
> --- syscall (number 99) ---
> end of kernel
> end trace frame: 0x7f7c5090, count: 8
> 0x13f92454032a:
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{2}> sh malloc
>    Type InUse  MemUse  HighUse   Limit  Requests  Type Lim  Kern Lim
>  devbuf  8699   5339K    5340K  78643K 10670 0 0
>     pcb    72 15K  15K  78643K   202 0 0
>  rtable  1231 26K  26K  78643K  1783 0 0
>  ifaddr   632 96K  96K  78643K   639 0 0
>    counters   455    398K 398K  78643K   455 0 0
>    ioctlops 1  4K   4K  78643K 92144 0 0
>     iov 0  0K   1K  78643K 1 0 0
>   mount 4  4K   4K  78643K 4 0 0
>  vnodes  1161 73K  73K  78643K  1168 0 0
>   UFS quota 1 32K  32K  78643K 1 0 0
>   UFS mount    17 56K  56K  78643K    17 0 0
>     shm 2  1K   1K  78643K 2 0 0
>  VM map 2  0K   0K  78643K 2 0 0
>     sem 2  0K   0K  78643K 2 0 0
>     dirhash    24  4K   4K  78643K    57 0 0
>    ACPI  5017    559K 671K  78643K 17335 0 0
>   file desc 1  0K   0K  78643K 1 0 0
>    proc    14 13K  13K  78643K    14 0 0
>     NFS srvsock 1  0K   0K  78643K 1 0 0
>  NFS daemon 1 16K  16K  78643K 1 0 0
>     ip_moptions    70  8K   8K  78643K    70 0 0
>    in_multi   411 27K  27K  78643K   412 0 0
>     ether_multi   414 12K  12K  78643K   415 0 0
>     ISOFS mount 1 32K  32K  78643K 1 0 0
>   MSDOSFS mount 1 16K  16K  78643K 1 0 0
>    ttys   420   1777K    1777K  78643K   420 0 0
>    exec 0  0K   1K  78643K  1273 0 0
>     pagedep 1  8K   8K  78643K 1 0 0
>    inodedep 1 32K  32K  78643K 1 0 0
>  newblk 1  0K   0K  78643K 1 0 0
>     VM swap 7    722K 722K  78643K 7 0 0
>    UVM amap   234  9K  26K  78643K  4750 0 0
>    UVM aobj 2  2K   2K  78643K 2 0 0
>     USB    99 44K  44K  78643K   121 0 0
>  USB device    32  2K   2K  78643K    32 0 0
>  USB HC 1  0K   0K  78643K 1 0 0
>     memdesc 1  4K   4K  78643K 1 0 0
>     crypto data 1  1K   1K  78643K 1 0 0
>     NDP   143  4K   5K  78643K   157 0 0
>    temp  1025   2620K    2624K  78643K 12704 0 0
>   SYN cache 2 16K  16K  78643K 2 0 0
> ddb{2}> sh all pools
> Name  Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
> arp 56  145    0    0 3 0 3 3 0 8    0
> pfsync  72    2    0    2 1 1 0 1 0 8    0
> inpcbpl    280    13877    0    13868 1 0 1 1 0 8    0
> plimitpl   152   31    0   11 1 0 1 1 0 8    0
> plcache    128  240    0    0 8 0 8 8 0 8    0
> rtentry    112  605    0    2    18 0    18    18 0 8    0
> tcpcb  544    5    0    0 1 0 1 1 0 8    0
> nd6 48   30    0    0 1 0 1 1 0 8    0
> pfosfp  40 1269    0  846 5 0 5 5 0 8    0
> pfosfpen   112 2142    0 1428    21 0    21    21 0 8    0
> pfrktable  1344   3    0    1 1 0 

Re: axen Ethernet device errors on both USB3.0 and USB2.0 ports

2018-07-31 Thread Denis
Hello,

Thank you for the patch. I can apply it a bit later; I am on a trip currently.



On 7/30/2018 3:11 AM, sc.dy...@gmail.com wrote:
> Hi,
> 
> On 2018/07/27 09:14, Denis wrote:
>> Every time (after 2-3 minutes of work) ASIX Electronics USB-Ethernet
>> device reports:
>>
>> axen0: usb errors on rx: IOERROR
>> axen0: usb errors on rx: IOERROR
>> axen0: usb errors on tx: IOERROR
>> axen0: watchdog timeout
>> axen0: usb errors on tx: IOERROR
>>
>> The device hangs and must be reattached to have it working again for 2-3
>> minutes.
> 
> Do you want to try this patch?
> 
> -
>  - header: fix comments
>  - header: fix unused L3 type mask definition
>  - rxeof: Avoid allocating mbuf if checksum errors are detected.
>  - rxeof: Avoid loop to extract packets if pkt_count is 0.
>  - rxeof: Add more sanity checks.
>  - rxeof: Increment if_ierror in some error paths.
>  - qctrl: Apply queuing control parameters from FreeBSD axge(4).
>  - qctrl: Set qctrl in miireg_statchg dynamically, not statically.
>  - Use DMA buffer aligned at 64KB boundary to avoid xhci bug.
> 
> 
> --- sys/dev/usb/if_axenreg.h    Fri Sep 16 22:17:07 2016
> +++ sys/dev/usb/if_axenreg.h    Mon Jun 19 10:54:28 2017
> @@ -26,8 +26,8 @@
>   * |    | ++-L3_type (1:ipv4, 0/2:ipv6)
>   *    pkt_len(13)  |    | ||+ ++-L4_type(0: icmp, 1: UDP, 4: TCP)
>   * |765|43210 76543210|7654 3210 7654 3210|
> - *  ||+-crc_err  |+-L4_err |+-L4_CSUM_ERR
> - *  |+-mii_err   +--L3_err +--L3_CSUM_ERR
> + *  ||+-crc_err   |+-L4_err |+-L4_CSUM_ERR
> + *  |+-mii_err    +--L3_err +--L3_CSUM_ERR
>   *  +-drop_err
>   *
>   * ex) pkt_hdr 0x00680820
> @@ -70,7 +70,7 @@
>  #define   AXEN_RXHDR_L4_TYPE_TCP    0x4
>  
>  /* L3 packet type (2bit) */
> -#define AXEN_RXHDR_L3_TYPE_MASK    0x0600
> +#define AXEN_RXHDR_L3_TYPE_MASK    0x0060
>  #define AXEN_RXHDR_L3_TYPE_OFFSET    5
>  #define   AXEN_RXHDR_L3_TYPE_UNDEF    0x0
>  #define   AXEN_RXHDR_L3_TYPE_IPV4    0x1
> --- sys/dev/usb/if_axen.c.orig    Tue Jun 12 15:36:59 2018
> +++ sys/dev/usb/if_axen.c    Sun Jul 29 01:53:43 2018
> @@ -53,6 +53,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -121,6 +122,13 @@ void    axen_unlock_mii(struct axen_softc *sc);
>  
>  void    axen_ax88179_init(struct axen_softc *);
>  
> +struct axen_qctrl axen_bulk_size[] = {
> +    { 7, 0x4f, 0x00, 0x12, 0xff },
> +    { 7, 0x20, 0x03, 0x16, 0xff },
> +    { 7, 0xae, 0x07, 0x18, 0xff },
> +    { 7, 0xcc, 0x4c, 0x18, 0x08 }
> +};
> +
>  /* Get exclusive access to the MII registers */
>  void
>  axen_lock_mii(struct axen_softc *sc)
> @@ -238,6 +246,8 @@ axen_miibus_statchg(struct device *dev)
>  int    err;
>  uint16_t    val;
>  uWord    wval;
> +    uint8_t    linkstat = 0;
> +    int    qctrl;
>  
>  ifp = GET_IFP(sc);
>  if (mii == NULL || ifp == NULL ||
> @@ -265,27 +275,49 @@ axen_miibus_statchg(struct device *dev)
>  return;
>  
>  val = 0;
> -    if ((IFM_OPTIONS(mii->mii_media_active) & IFM_FDX) != 0)
> +    if ((IFM_OPTIONS(mii->mii_media_active) & IFM_FDX) != 0) {
>  val |= AXEN_MEDIUM_FDX;
> +    if ((IFM_OPTIONS(mii->mii_media_active) & IFM_ETH_TXPAUSE) != 0)
> +    val |= AXEN_MEDIUM_TXFLOW_CTRL_EN;
> +    if ((IFM_OPTIONS(mii->mii_media_active) & IFM_ETH_RXPAUSE) != 0)
> +    val |= AXEN_MEDIUM_RXFLOW_CTRL_EN;
> +    }
>  
> -    val |= (AXEN_MEDIUM_RECV_EN | AXEN_MEDIUM_ALWAYS_ONE);
> -    val |= (AXEN_MEDIUM_RXFLOW_CTRL_EN | AXEN_MEDIUM_TXFLOW_CTRL_EN);
> +    val |= AXEN_MEDIUM_RECV_EN;
>  
> +    /* bulkin queue setting */
> +    axen_lock_mii(sc);
> +    axen_cmd(sc, AXEN_CMD_MAC_READ, 1, AXEN_USB_UPLINK, );
> +    axen_unlock_mii(sc);
> +
>  switch (IFM_SUBTYPE(mii->mii_media_active)) {
>  case IFM_1000_T:
>  val |= AXEN_MEDIUM_GIGA | AXEN_MEDIUM_EN_125MHZ;
> +    if (linkstat & AXEN_USB_SS)
> +    qctrl = 0;
> +    else if (linkstat & AXEN_USB_HS)
> +    qctrl = 1;
> +    else
> +    qctrl = 3;
>  break;
>  case IFM_100_TX:
>  val |= AXEN_MEDIUM_PS;
> +    if (linkstat & (AXEN_USB_SS | AXEN_USB_HS))
> +    qctrl = 2;
> +    else
> +    qctrl = 3;
>  break;
>  case IFM_10_T:
> -    /* doesn't need to be handled */
> +    default:
> +    qctrl = 3;
>  break;
>  }
>  
>  DPRINTF(("axen_miibus_statchg: val=0x%x\n", val));
>  USETW(wval, val);
>  axen_lock_mii(sc);
> +    axen_cmd(sc, AXEN_CMD_MAC_SET_RXSR, 5, AXEN_RX_BULKIN_QCTRL,
> +    _bulk_size[qctrl]);
>  err = axen_cmd(sc, AXEN_CMD_MAC_WRITE2, 2, AXEN_MEDIUM_STATUS, );
>  axen_unlock_mii(sc);
>  if (err) {
> @@ -408,7 +440,6 @@ axen_ax88179_init(struct axen_softc *sc)
>  uWord    wval;
>  uByte    val;
>  u_int16_t ctl, temp;
> -    struct axen_qctrl qctrl;
>  
>