Re: Recent amd64 auich panics

2017-01-23 Thread Martin Husemann
On Mon, Jan 23, 2017 at 08:33:34AM +, Chavdar Ivanov wrote:
> Hi,
> 
> The last few days I am getting repeated panics on amd64 -current running
> under VirtualBox (latest 5.1.14 version at the moment) as follows:
> ---
> panic: auidui_init_ringbuffer: blksize=0

There were some changes in this area yesterday which fixed a similar
problem for me, also this code probably needs fixing:

static int
auich_round_blocksize(void *v, int blk, int mode,
    const audio_params_t *param)
{

	return blk & ~0x3f;		/* keep good alignment */
}

It should return 0x40 instead of 0 for too small block sizes.
Something like:

	if (blk < 0x40)
		return 0x40;	/* avoid 0 block size */
	return blk & ~0x3f;	/* keep good alignment */
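
As a stand-alone illustration of the suggestion above (a minimal sketch,
not the actual driver code): the helper round_blocksize() is hypothetical
and drops the driver arguments that the rounding does not use, and
BLK_ALIGN is a name assumed here for the 0x40 alignment implied by the
0x3f mask.

```c
/* Assumed name for the 64-byte alignment implied by the 0x3f mask. */
#define BLK_ALIGN 0x40

/*
 * Hypothetical stand-alone version of the rounding logic; the real
 * auich_round_blocksize() takes additional driver arguments.
 */
int
round_blocksize(int blk)
{
	if (blk < BLK_ALIGN)
		return BLK_ALIGN;	/* avoid a 0 block size */
	return blk & ~(BLK_ALIGN - 1);	/* round down to a multiple of 0x40 */
}
```

With this sketch, requests of 0 or 63 bytes both come back as 0x40, while
a request of 200 bytes is rounded down to 192 (0xc0).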

Martin


Recent amd64 auich panics

2017-01-23 Thread Chavdar Ivanov
Hi,

The last few days I am getting repeated panics on amd64 -current running
under VirtualBox (latest 5.1.14 version at the moment) as follows:
---
panic: auidui_init_ringbuffer: blksize=0
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 80115455 cs 8 rflags 246 cr2 0 ilevel 8 rsp fe8043d4bf80
curlwp 0xfe8043d54620 pid 0.45 lowest kstack 0xfe8043d492c0
Stopped in pid 0.45 (system) at netbsd:breakpoint+0x5: leave
db{2}> bt
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
snprintf() at netbsd:snprintf
audio_initbufs() at netbsd:audio_initbufs
audio_initbufs() at netbsd:audio_initbufs+0xb4
audiosetinfo() at netbsd:audiosetinfo+0x95e
audio_set_vchan_defaults.isra.14.constprop.16() at netbsd:audio_set_vchan_defaults.isra.14.constprop.16+0x17b
audioattach() at netbsd:audioattach+0x68a
config_attach_loc() at netbsd:config_attach_loc+0x17a
config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
audio_attach_mi() at netbsd:audio_attach_mi+0x32
auich_finish_attach() at netbsd:auich_finish_attach+0x4e
config_interrupts_thread() at netbsd:config_interrupts_thread+0x30
db{2}>


If I disable auich, it boots fine.

The dmesg buffer is not saved and for some reason 'reboot 0x104' does not
produce a dump, so the above is manually taken.

Chavdar


Re: Starting NPF crashes amd64 -current (23 Jan 2017)

2017-01-23 Thread Geoff Wing
On Tuesday 2017-01-24 15:59 +1100, Geoff Wing output:
:starting -current on amd64, I get a crash during (presumably) /etc/rc.d/npf

Panics from previous message were when I had
pseudo-device   npf
in my kernel config.  Removing that I get panics at the same
place (npfctl) as
mutex_vector_error: locking against myself

address:  ..f81327488
cpu:      0
lwp:      blah...5420
field:    blah...5420   wait/spin: 0/0

Stopped in pid 206.1 (npfctl) at netbsd:breakpoint+0x5: leave

Regards,
Geoff


Starting NPF crashes amd64 -current (23 Jan 2017)

2017-01-23 Thread Geoff Wing
Hi,
starting -current on amd64, I get a crash during (presumably) /etc/rc.d/npf

I have some dynamic tables in /etc/npf.conf, e.g.
table  type tree dynamic
though maybe not relevant.

Panics are copied from phone video.  I can't get a crash dump, nor
does my computer keep system message logs over reboot.

One panic had

panic: kernel diagnostic assertion "elements > 0" failed: sys/kern/subr_hash.c: line 93

vpanic()
ch_voltag_convert_in()
hashinit()+0x1b4
npf_table_create()+0xc7
pf_mk_tables.isra.0()+0x20f
npfctl_load()+0x1d3
VOP_IOCTL()
vn_ioctl()
sys_ioctl()
syscall()
-- syscall (54)
dumping to dev 168,10  not possible
rebooting


After several rebuilds and changing dump device, I get

db{0}> sync

dumping to dev 168,10 (offset=27281471, size=3143454)
dump uvm_fault(0xfe833bacd5c8, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6
...blah...
Stopped in pid 206.1 (npfctl) at netbsd:sparse_dump_mark+0x10c: cmpq $0,38(%rax)


Re: OpenVPN causes fresh -current to crash

2017-01-23 Thread Ryota Ozaki
On Tue, Jan 24, 2017 at 12:53 AM, Tom Ivar Helbekkmo wrote:
> Ryota Ozaki writes:
>
>> The latest pfil.c (v1.34) should fix the panic. Could you try it?
>
> I'll give it a go tonight, and report back.

Thanks.

>
> Meanwhile, do you think this ongoing MPSAFE work may have some unwanted
> consequences for NFS?  There's a problem that's been around for at least
> a couple of months, but that I only discovered the other day -- I was
> running with kernels from late October then, and the problem I observed
> is still there after upgrading.

I'm not sure. I don't know much about NFS, how it works and how it involves
the network stack.

>
> Reading NFS file systems is no problem, which is why I didn't notice it
> before, but writing hangs.  Here's an example: I started compiling a C
> source file directly to an executable on an NFS mounted file system
> (server and client both amd64 running fresh -current).  The compile pass
> is fine, but when the ld end of the pipeline wants to write the
> executable, it hangs.  So I try to do a 'df' in another terminal, and it
> hangs.  Finally, I simply attempt to make 'ls -l [target executable]'
> show me if it's written anything yet, and that hangs, too: after an
> attempt to write has hung the communication up, reads no longer work,
> either:
>
>  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT TTY     TIME    COMMAND
>    0 22179 22678   0 124  0 33344 5136 netio  D+   pts/17  0:00.01 ld [...]
>  501 21370 21006 516  85  0  8952 1144 nfsrcv I+   pts/18  0:00.00 df
>  501 21710     1   0 127  0  8964 1116 tstile D    pts/20- 0:00.00 /bin/ls [...]
>
> Once I have something with "tstile" in the "WCHAN" column, I know that
> I can't just reboot the machine: it's going to take a hard reset.

Can you get into DDB? If so, you can find out where the processes hang:
  db> ps # you can get LWP addresses of ld and ls
  db> bt/a  # you can get their stack traces

I guess ps will also show some other LWPs stuck on tstile, for example
softnet/N. Stack traces of such LWPs would explain how the hang happens,
or at least give hints for further investigation.

>
> Oh, and it's the client that hangs; the server seems to be just fine,
> and a reboot of the client makes NFS reads behave normally again.  On
> the server, the output file got created, but is zero bytes.  The error
> logged on the client when it gets stuck is this console output:
>
> nfs send error 64 for barsoom:/usr/local
>
> ...and then the normal "nfs server not responding" messages in syslog
> after that, of course.

I tried an NFS client running -current against an NFS server running
netbsd-7, but writing didn't hang (I compiled a C program and ran
cp -r /etc/ /mnt/nfs). The hang may depend on the NIC. Which NIC do you use?

Also, could you tell me the NFS options used on the client and the server?

  ozaki-r


daily CVS update output

2017-01-23 Thread NetBSD source update

Updating src tree:
P src/external/bsd/dhcpcd/dist/dhcpcd-run-hooks.8.in
P src/external/bsd/dhcpcd/dist/dhcpcd.8.in
P src/external/bsd/dhcpcd/dist/dhcpcd.conf.5.in
P src/share/man/man9/disk.9
P src/share/man/man9/vnode.9
P src/sys/arch/amd64/conf/XEN3_DOM0
P src/sys/arch/amd64/conf/XEN3_DOMU
P src/sys/arch/i386/conf/XEN3_DOM0
P src/sys/arch/i386/conf/XEN3_DOMU
P src/sys/dev/ic/dwc_gmac.c
P src/sys/net/bpf.c
P src/sys/net/bpfdesc.h
P src/sys/net/if.c
P src/sys/net/if_bridge.c
P src/sys/net/if_tun.c
P src/sys/net/if_vlan.c
P src/sys/netinet/in.c
P src/sys/netinet/in_pcb.c
P src/sys/netinet6/in6.c
P src/sys/netinet6/in6_pcb.c
P src/usr.bin/mail/mail.1

Updating xsrc tree:


Killing core files:

Running the SUP scanner:
SUP Scan for current starting at Tue Jan 24 03:01:41 2017
SUP Scan for current completed at Tue Jan 24 03:02:02 2017
SUP Scan for mirror starting at Tue Jan 24 03:02:02 2017
SUP Scan for mirror completed at Tue Jan 24 03:04:49 2017




Updating file list:
-rw-rw-r--  1 srcmastr  netbsd  51480962 Jan 24 03:06 ls-lRA.gz


Re: current panics with urtwn0

2017-01-23 Thread Stefan Hertenberger
--- Begin Message ---
On Mon, 23 Jan 2017 20:42:32 + (UTC),
chris...@astron.com (Christos Zoulas) wrote:

> Looks like it should have already been fixed.
> 
> christos
> 

Sorry for the noise; after cleaning the obj dir and rebuilding, the
panic is gone.

thx Stefan
--- End Message ---


Re: current panics with urtwn0

2017-01-23 Thread Christos Zoulas
In article <20170123211552.562439cc@bsd64.localdomain>,
Stefan Hertenberger wrote:
>
>Content preview:  hello, latest current panics when my usb wireless device is
>   attached. i made a screenshot of the stack trace =>
>https://www.alarum.de/netbsd/20170123_205754.jpg
>   urtwn0 at uhub4 port 4 urtwn0: Realtek 802.11n WLAN Adapter, rev 2.00/2.00,
>   addr 4 urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 74:da:38:88:7c:ad
>   urtwn0: 1 rx pipe, 2 tx pipes urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
>   urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps
>18Mbps 24Mbps
>   36Mbps 48Mbps 54Mbps [...] 
>

Looks like it should have already been fixed.

christos



current panics with urtwn0

2017-01-23 Thread Stefan Hertenberger
--- Begin Message ---
hello,

latest current panics when my usb wireless device is attached. i made
a screenshot of the stack trace =>
https://www.alarum.de/netbsd/20170123_205754.jpg

urtwn0 at uhub4 port 4
urtwn0: Realtek 802.11n WLAN Adapter, rev 2.00/2.00, addr 4
urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 74:da:38:88:7c:ad
urtwn0: 1 rx pipe, 2 tx pipes
urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps
24Mbps 36Mbps 48Mbps 54Mbps
--- End Message ---


Re: reproducible kernel crash in NetBSD 7.1_RC1

2017-01-23 Thread Christos Zoulas
In article ,
 <6b...@6bone.informatik.uni-leipzig.de> wrote:
>Hello,
>
>on NetBSD 7.1_RC1 (and earlier) I can create a kernel crash as follows:
>
>ifconfig ixg0 ip4csum tcp4csum udp4csum tcp6csum udp6csum ip4csum-tx
>ip4csum-rx tcp4csum-tx tcp4csum-rx udp4csum-tx udp4csum-rx tcp6csum-tx
>tcp6csum-rx udp6csum-tx udp6csum-rx tso4 tso6
>ifconfig vlan850 create
>ifconfig vlan850 vlan 850 vlanif ixg0 up
>ifconfig vlan850 ip4csum tcp4csum udp4csum tcp6csum udp6csum ip4csum-tx
>ip4csum-rx tcp4csum-tx tcp4csum-rx udp4csum-tx udp4csum-rx tcp6csum-tx
>tcp6csum-rx udp6csum-tx udp6csum-rx tso4 tso6
>ifconfig vlan850 destroy
>
>If it does not work, try again.
>
>Can anyone take a look at the problem or should I make a bug report?
>
>Thank you for your efforts

I think that the vlan creation/removal code is racy even under -current.
There was some discussion about it recently.

christos



Re: OpenVPN causes fresh -current to crash

2017-01-23 Thread Jarle Greipsland
Tom Ivar Helbekkmo writes:
[ ... ]
> Oh, and it's the client that hangs; the server seems to be just fine,
> and a reboot of the client makes NFS reads behave normally again.  On
> the server, the output file got created, but is zero bytes.  The error
> logged on the client when it gets stuck is this console output:
[ ... ]
Could this be another manifestation of PR/50432?

-jarle
--
"A firewall that lets NFS through is like a seatbelt that is designed to let
 your face reach the dashboard."
-- m...@tis.com (Marcus J Ranum)


Re: OpenVPN causes fresh -current to crash

2017-01-23 Thread Tom Ivar Helbekkmo
Ryota Ozaki writes:

> The latest pfil.c (v1.34) should fix the panic. Could you try it?

I'll give it a go tonight, and report back.

Meanwhile, do you think this ongoing MPSAFE work may have some unwanted
consequences for NFS?  There's a problem that's been around for at least
a couple of months, but that I only discovered the other day -- I was
running with kernels from late October then, and the problem I observed
is still there after upgrading.

Reading NFS file systems is no problem, which is why I didn't notice it
before, but writing hangs.  Here's an example: I started compiling a C
source file directly to an executable on an NFS mounted file system
(server and client both amd64 running fresh -current).  The compile pass
is fine, but when the ld end of the pipeline wants to write the
executable, it hangs.  So I try to do a 'df' in another terminal, and it
hangs.  Finally, I simply attempt to make 'ls -l [target executable]'
show me if it's written anything yet, and that hangs, too: after an
attempt to write has hung the communication up, reads no longer work,
either:

 UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT TTY     TIME    COMMAND
   0 22179 22678   0 124  0 33344 5136 netio  D+   pts/17  0:00.01 ld [...]
 501 21370 21006 516  85  0  8952 1144 nfsrcv I+   pts/18  0:00.00 df
 501 21710     1   0 127  0  8964 1116 tstile D    pts/20- 0:00.00 /bin/ls [...]

Once I have something with "tstile" in the "WCHAN" column, I know that
I can't just reboot the machine: it's going to take a hard reset.

Oh, and it's the client that hangs; the server seems to be just fine,
and a reboot of the client makes NFS reads behave normally again.  On
the server, the output file got created, but is zero bytes.  The error
logged on the client when it gets stuck is this console output:

nfs send error 64 for barsoom:/usr/local

...and then the normal "nfs server not responding" messages in syslog
after that, of course.

-tih
-- 
Most people who graduate with CS degrees don't understand the significance
of Lisp.  Lisp is the most important idea in computer science.  --Alan Kay