On 18 January 2016 at 23:20, Jonathon Sisson <[email protected]> wrote:
> On Mon, Jan 18, 2016 at 08:30:21PM +0100, Mike Belopuhov wrote:
>> On Mon, Jan 18, 2016 at 20:25 +0100, Mike Belopuhov wrote:
>> > On Sun, Jan 17, 2016 at 20:46 -0800, Jonathon Sisson wrote:
>> > > Hi,
>> > >
>> > > First off, thank you for OpenBSD in general, and thank you specifically
>> > > for the PV drivers on OpenBSD =) The day of migrating workloads to AWS
>> > > gets ever closer for me, and I appreciate everything the OpenBSD dev
>> > > team does.
>> > >
>> > > I've found what appears to be a repeatable crash that results in this:
>> > >
>> > > panic: xnf0: save vs spell: 214
>> > >
>> > > Stopped at Debugger+0x9: leave
>> > > TID PID UID PRFLAGS PFLAGS CPU COMMAND
>> > > 14532 9243 0 0x3 0x4000000 1 python2.7
>> > > * 7215 9243 0 0x3 0x4000000 0 python2.7
>> > >
>> > > Debugger() at Debugger+0x9
>> > > panic() at panic+0xfe
>> > > xnf_encap() at xnf_encap+0x1a9
>> > > xnf_start() at xnf_start+0x7f
>> > > ifq_serialize() at ifq_serialize+0xd9
>> > > if_enqueue() at if_enqueue+0x71
>> > > ether_output() at ether_output+0x166
>> > > ip_output() at ip_output+0x6d3
>> > > tcp_output() at tcp_output+0x87e
>> > > tcp_usrreq() at tcp_usrreq+0x3fc
>> > > sosend() at sosend+0x3d8
>> > > dofilewritev() at dofilewritev+0x205
>> > > sys_write() at sys_write+0x89
>> > > syscall() at syscall+0x368
>> > > --- syscall (number 4) ---
>> > > end of kernel
>> > > end trace frame: 0x9a8c96a2800, count: 1
>> > > 0x9a91790279a:
>> > > --db_more--
>> > >
>> > > I'm unable to run further commands at the console, as AWS does not
>> > > provide console.
>> > >
>> > > I'm using this test machine to build CURRENT and upload it to an s3
>> > > bucket that I've been using for STABLE builds. The python code is
>> > > the awscli installed via py-pip running on Python 2.7.11. The precise
>> > > command is:
>> > >
>> > > aws s3 sync /usr/rel/ s3://$AWS_BUCKET_NAME/path/
>> > >
>> > > If there is any further testing I can provide, I am more than happy
>> > > to provide any details you need.
>> > >
>> > > -Jonathon
>> > >
>> >
>> > Can you please try the diff below on top of a -current kernel
>> > (I've pushed some additional Xen fixes just now).
>> >
>> > You should be able to copy the kernel into the AWS instance.
>> >
>> > My math wasn't correct here and txeof would unload a chain before
>> > we would've processed all descriptors/fragments.
>> >
>>
>> A slight amendment to the diff (forgot one chunk).
>>
>>
> Of course, 2 minutes after I sent that last email I get this:
>
> panic: xnf0: save vs spell: 129
>
> Stopped at Debugger+0x9: leave
> TID PID UID PRFLAGS PFLAGS CPU COMMAND
> *13456 16606 0 0x3 0x4000000 1 python2.7
> 10910 10910 0 0x14000 0x210 0 softnet
>
> Debugger() at Debugger+0x9
> panic() at panic+0xfe
> xnf_encap() at xnf_encap+0x1b7
> xnf_start() at xnf_start+0x7f
> ifq_serialize() at ifq_serialize+0xd9
> if_enqueue() at if_enqueue+0x71
> ether_output() at ether_output+0x166
> ip_output() at ip_output+0x6d3
> tcp_output() at tcp_output+0x87e
> tcp_usrreq() at tcp_usrreq+0x3fc
> sosend() at sosend+0x3d8
> dofilewritev() at dofilewritev+0x205
> sys_write() at sys_write+0x89
> syscall() at syscall+0x368
> --- syscall (number 4) ---
> end of kernel
> end trace frame: 0x2f508906d00, count: 1
> 0x2f4e69f079a:
> --db_more--
>
> This time took considerably longer, as multiple uploads were successful
> prior to the panic. This time I had to download install58.iso and then
> upload it to a test bucket to get it to panic.
>
> My apologies for the false confirmation.
>
> -Jonathon

That's OK. Thank you for taking the time to test it. I can reproduce the problem with tcpbench as well, and hopefully I will have a solution soon.
