Hi,
I'm hitting this bug with both ext4 and btrfs.
Here's an example of the backtrace:
https://gist.github.com/vzctl/e888a821333979120932
I tried raising this BUG only for direct ring and it solved the problem:
- BUG_ON(total_sg vq-vring.num);
+ BUG_ON(total_sg vq-vring.num !vq-indirect);
Shall I submit the patch or is a more elaborate fix required?
--
Alexey
On Mon, May 27, 2013 at 7:38 AM, Rusty Russell ru...@rustcorp.com.au wrote:
Dave Airlie airl...@gmail.com writes:
correct.
If I have an indirect ring and I'm adding sgs to it and the host is
delayed (say I've got a thread consuming things from the vring and its
off doing something interesting),
I'd really like to get ENOSPC back from virtqueue_add. However if the
indirect addition fails due to free_sg being 0, we hit the BUG_ON
before we ever get to the ENOSPC check.
It is correct for the moment: drivers can't assume indirect buffer
support in the transport.
BUT for a new device, we could say this depends on indirect descriptor
support, put the appropriate check in the device init, and then remove
the BUG_ON().
But if the transport has indirect buffer support, can it change its
mind at runtime?
It's a layering violation. The current rule is simple:
A driver should never submit a buffer which can't fit in the
ring.
This has the nice property that OOM (ie. indirect buffer alloction
fail) just slows things down, doesn't break things.
In this case we have vq-indirect set, but the device has run out of
free buffers,
but it isn't a case that in+out would overflow it if it had free
buffers since it would use
an indirect and succeed.
OK, but when do you re-xmit? What if the ring is empty, and you
submitted a buffer that needs indirect? You won't get interrupted
again, because the device has consumed all the buffers. You need to
have your own timer or something equally hackish.
Getting -ENOSPC is definitely what should happen from what I can see,
not a BUG_ON,
I should get a BUG_ON only if the device reports no indirect support.
I think we should return -ENOMEM if we can't indirect because of failed
allocation and it doesn't fit in the ring, ie:
This:
BUG_ON(total_sg vq-vring.num);
BUG_ON(total_sg == 0);
if (vq-vq.num_free total_sg) {
pr_debug(Can't add buf len %i - avail = %i\n,
total_sg, vq-vq.num_free);
/* FIXME: for historical reasons, we force a notify here if
* there are outgoing parts to the buffer. Presumably the
* host should service the ring ASAP. */
if (out_sgs)
vq-notify(vq-vq);
END_USE(vq);
return -ENOSPC;
}
Becomes (untested):
BUG_ON(total_sg == 0);
if (vq-vq.num_free total_sg) {
if (total_sg vq-vring.num) {
BUG_ON(!vq-indirect);
/* Return -ENOMEM only if we have nothing else to do */
if (vq-vq.num_free == vq-vring.num)
return -ENOMEM;
}
pr_debug(Can't add buf len %i - avail = %i\n,
total_sg, vq-vq.num_free);
/* FIXME: for historical reasons, we force a notify here if
* there are outgoing parts to the buffer. Presumably the
* host should service the ring ASAP. */
if (out_sgs)
vq-notify(vq-vq);
END_USE(vq);
return -ENOSPC;
}
Cheers,
Rusty.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization