On Fri, Aug 17, 2018 at 12:59:57PM +0200, Jaromir Dolecek wrote:
>
> Yes, one of the problems is that the code happily handles stale bufs: it
> does not clear the buf pointer once the response has been handled. We
> should add some KASSERTs there for this and clear the response structure
> on reuse.
> On 17 Aug 2018 at 07:07, Michael van Elst wrote:
>
>> On Fri, Aug 17, 2018 at 02:23:16AM +0000, Emmanuel Dreyfus wrote:
>>
>>blkif_response_t *rep = RING_GET_RESPONSE(&sc->sc_ring, i);
>>struct xbd_req *xbdreq = &sc->sc_reqs[rep->id];
>>bp = xbdreq->req_bp;
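A minimal sketch of the hardening suggested above, applied to the quoted
response loop (the req_bp member and the dk_done() call follow the code
quoted in this thread; the rest is an assumption, not the actual driver
source):

    blkif_response_t *rep = RING_GET_RESPONSE(&sc->sc_ring, i);
    struct xbd_req *xbdreq = &sc->sc_reqs[rep->id];
    struct buf *bp = xbdreq->req_bp;

    /* A stale or replayed response trips this assertion instead
     * of silently completing somebody else's buf. */
    KASSERT(bp != NULL);

    /* Clear the pointer before completing, so this response can
     * never hand bp to dk_done() a second time. */
    xbdreq->req_bp = NULL;

    dk_done(&sc->sc_dksc, bp);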
On Fri, Aug 17, 2018 at 02:23:16AM +0000, Emmanuel Dreyfus wrote:
> blkif_response_t *rep = RING_GET_RESPONSE(&sc->sc_ring, i);
> struct xbd_req *xbdreq = &sc->sc_reqs[rep->id];
> bp = xbdreq->req_bp;
>
> It decides to call dk_done for the last occurrence
On Fri, Aug 10, 2018 at 06:55:46AM -, Michael van Elst wrote:
> a queued operation eventually returns with a call to xbd_handler.
> - for every buffer returned, dk_done is called which finally ends
> in invoking biodone.
After adding debug statements, I can now tell the offending buf_t
is queued
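For reference, the completion path described above, as a hedged outline
(the chain follows this thread; argument lists are abbreviated):

    xbd_handler(...)                 /* Xen event-channel interrupt  */
        -> dk_done(&sc->sc_dksc, bp) /* generic disk(9) completion   */
            -> biodone(bp)           /* marks bp done, wakes waiters */

biodone() must run exactly once per issued buf; a second call is exactly
what the "biodone2 already" panic reports.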
m...@netbsd.org (Emmanuel Dreyfus) writes:
>I can tell that in vfs_bio.c, bread() -> bio_doread() will call
>VOP_STRATEGY once for the offending buf_t, but biodone() is called twice
>in interrupt context for the buf_t, leading to the "biodone2 already"
>panic later.
>Since you know the xbd code you
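For context, the panic string comes from the double-completion guard in
biodone2() in sys/kern/vfs_bio.c; paraphrased from memory, so treat the
exact code as approximate:

    /* A buf may be completed only once. */
    if (ISSET(bp->b_oflags, BO_DONE))
        panic("biodone2 already");
    SET(bp->b_oflags, BO_DONE);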
Emmanuel Dreyfus wrote:
> > xbd is not mpsafe, so there shouldn't even be a race due to parallel
> > processing on different CPUs. Maybe it would be useful to check whether
> > the problem still happens when you assign just a single CPU to the DOMU.
>
> I get the crash with vcpu = 1 for the domU. I
On Wed, Aug 08, 2018 at 10:30:23AM +0700, Robert Elz wrote:
> This suggests to me that something is getting totally scrambled in
> the buf headers when things get busy.
I tried to crash with BIOHIST enabled; here is the story about the
buf_t that triggers the panic because it has BO_DONE:
1533732322
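For anyone reproducing the tracing: BIOHIST is the KERNHIST-based record
of buffer I/O events. Assuming the usual option names (worth verifying
against sys/kern/vfs_bio.c), it is enabled in the kernel config with
something like:

    options 	KERNHIST	# generic kernel-history support
    options 	BIOHIST		# record buffer I/O history

and the accumulated history can then be inspected from ddb after the
panic.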
Robert Elz wrote:
> This suggests to me that something is getting totally scrambled in
> the buf headers when things get busy.
I tried dumping the buf_t before the panic, to check whether it could be
completely corrupted, but it seems that is not the case. Its blkno is
4904744, and the filesystem has 131891200 blocks
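The pre-panic dump used for this kind of check might look like the
following sketch (the fields are standard struct buf members; the helper
name and its call site are made up for illustration):

    #include <sys/param.h>
    #include <sys/buf.h>
    #include <sys/systm.h>

    /* Hypothetical helper: print the fields that let us sanity-check
     * a buf against the filesystem before panicking. */
    static void
    dump_buf_before_panic(struct buf *bp)
    {
        printf("buf %p: blkno %jd bcount %d b_flags %x"
            " b_oflags %x b_cflags %x data %p\n",
            bp, (intmax_t)bp->b_blkno, bp->b_bcount,
            bp->b_flags, bp->b_oflags, bp->b_cflags, bp->b_data);
    }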
For what it is worth, and in this case it might not be much, I did a similar
test on my test amd64 DomU last night.
Running dump and /etc/daily in parallel did nothing, but running lots of
them in parallel eventually did cause a crash.
But I saw a different crash -- rather than a biodone2, I got
Jaromír Doleček wrote:
> Thanks. Could you please try a -current kernel for the DOMU and see if it
> crashes the same? If possible a DOMU kernel from the daily builds, to rule
> out a local compiler issue.
It crashes the same way with a kernel from 201808050730. Here is the uname
-a output:
NetBSD bacasable
2018-08-07 18:42 GMT+02:00 Emmanuel Dreyfus:
> kern/53506
Thanks. Could you please try a -current kernel for the DOMU and see if it
crashes the same? If possible a DOMU kernel from the daily builds, to rule
out a local compiler issue.
There are not really many differences in the xbd/evtchn code itself
between
Martin Husemann wrote:
> Please file a PR.
kern/53506
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
Emmanuel Dreyfus wrote:
> /sbin/dump -a0f /dev/null /
> sh /etc/daily
The second command can be replaced by a simple
grep -r something /etc
But so far I have not managed to crash it without running dump.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
On Tue, Aug 07, 2018 at 05:30:27PM +0200, Emmanuel Dreyfus wrote:
> I can reproduce the crash at will: running at the same time the two
> following commands reliably triggers "panic: biodone2 already"
>
> /sbin/dump -a0f /dev/null /
> sh /etc/daily
Please file a PR.
Martin
Jaromír Doleček wrote:
> This is always a bug: the driver processes the same buf twice. It can
> do harm: if the buf is reused for some other I/O, the system can fail
> to store data, or claim to have read data when it didn't.
I can reproduce the crash at will: running at the same time the two
following commands reliably triggers "panic: biodone2 already":
/sbin/dump -a0f /dev/null /
sh /etc/daily
Emmanuel Dreyfus wrote:
> Here it is.
And here is another flavor below
I am now convinced the problem came with NetBSD 8.0:
I found that two other domUs have crashed on daily backup since the
NetBSD 8.0 upgrade, and the panic is also "biodone2 already".
I will start downgrading today.
uvm_fault(0xc06d4960, 0
Martin Husemann wrote:
> What driver is this?
xbd. This is a NetBSD-8.0/i386 Xen domU on top of a NetBSD-8.0/amd64
dom0 running on Xen 4.8.3.
In the dom0, the disk image is in a file in a FFSv2 filesystem on a
RAIDframe RAID 1, with two wd disks.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
On Mon, Aug 06, 2018 at 08:37:56PM +0200, Emmanuel Dreyfus wrote:
> cpu0: Begin traceback...
> vpanic(c04da74f,dcbdfdd4,dcbdfe58,c010bb65,c04da74f,dcbdfe64,dcbdfe64,4,dcbde2c0,210202)
> at netbsd:vpanic+0x12d
> panic(c04da74f,dcbdfe64,dcbdfe64,4,dcbde2c0,210202,fd893fff,8,c06ba580,ff491)
> at net
Jaromír Doleček wrote:
> Can you give a full backtrace?
Here it is. I wonder if things would change without -o log.
cpu0: Begin traceback...
vpanic(c04da74f,dcbdfdd4,dcbdfe58,c010bb65,c04da74f,dcbdfe64,dcbdfe64,4,dcbde2c0,210202)
at netbsd:vpanic+0x12d
panic(c04da74f,dcbdfe64,dcbdfe64,4,dcbde2
This is always a bug: the driver processes the same buf twice. It can do
harm: if the buf is reused for some other I/O, the system can fail to
store data, or claim to have read data when it didn't.
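Spelled out, the hazard is a sequence like this (an illustration of the
failure mode, not a traced execution):

    1. the driver completes bp and calls biodone(bp)  -- I/O #1 done
    2. bp is released and reused by the buffer cache for I/O #2
    3. a stale ring slot still holds a pointer to bp
    4. the stale slot is processed and biodone(bp) runs again: either
       I/O #2 is reported complete although nothing was transferred,
       or the BO_DONE check fires first and the kernel panics with
       "biodone2 already"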
Can you give a full backtrace?
Jaromir
2018-08-06 17:56 GMT+02:00 Emmanuel Dreyfus:
> Hello
>
> I have a Xen domU th