Re: [Qemu-devel] (v2. forward to qemu )-Panic with ext4, nbd, qemu-img, block

2018-01-25 Thread Eric Blake
On 01/21/2018 08:06 PM, Hongzhi, Song wrote:
> Hello,
> 
> I created a virtual disk image using qemu-img, mapped it to /dev/nbd0
> with qemu-nbd, and mounted the device on a local directory as ext4.
> 
> When I then write to the image with rsync or dd, I hit errors in the
> ext4 filesystem and the block layer.
> 
> Reproduce:
> 
>     qemu-img create test.img 2G
>     mkfs.ext4 -F test.img
>     qemu-nbd -f raw -c /dev/nbd0 test.img
>     mount -t ext4 /dev/nbd0 LOCAL_DIR/
>     rsync -av META_DATA_DIR/ LOCAL_DIR/
> 
> Qemu Version:
> 
>     QEMU emulator version 2.10.0

There have been some bug fixes in the NBD code in qemu 2.11; does using
a newer version make a difference in your results?


> Detail:
> 
> 
> 329.11 EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
> 329.12 block nbd0: Connection timed out
> 329.13 block nbd0: shutting down sockets

This sounds like a log of the kernel side; but it is rather sparse on
details on why the kernel lost the connection to the socket provided by
qemu-nbd -c.  Is there any chance we can get a corresponding trace from
qemu-nbd when reproducing the lost connection?
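
One possible way to collect both sides at once is sketched below. It is
only a sketch: the IMG, DEV and MNT defaults, the log file names and the
function name are placeholders rather than anything from the report, and
whether --trace produces any output depends on the trace backend your
qemu build was configured with.

```shell
#!/bin/sh
# Sketch only: capture qemu-nbd output next to the kernel log.
# IMG, DEV, MNT and the log file names are illustrative placeholders.
IMG=${IMG:-test.img}
DEV=${DEV:-/dev/nbd0}
MNT=${MNT:-LOCAL_DIR}

capture_nbd_trace() {
    # Follow the kernel log in the background so both logs can be lined up.
    dmesg -w > kernel.log 2>&1 &
    dmesg_pid=$!

    # -v keeps qemu-nbd in the foreground and prints extra debugging output.
    # --trace is assumed to be usable in this build; drop it if it is not.
    qemu-nbd -v --trace 'enable=nbd_*' -f raw -c "$DEV" "$IMG" \
        > qemu-nbd.log 2>&1 &
    nbd_pid=$!

    sleep 1   # crude wait for the device to be connected
    mount -t ext4 "$DEV" "$MNT"
    rsync -av META_DATA_DIR/ "$MNT"/

    # Tear down; some of these may fail if the connection is already gone.
    umount "$MNT"
    qemu-nbd -d "$DEV"
    kill "$dmesg_pid"
    wait "$nbd_pid" || true
}
```

Running capture_nbd_trace as root should leave kernel.log and
qemu-nbd.log in the current directory for comparison.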

> 329.14 blk_update_request: I/O error, dev nbd0, sector 304384
> 329.15 blk_update_request: I/O error, dev nbd0, sector 304640
> 329.16 blk_update_request: I/O error, dev nbd0, sector 304896
> 329.17 blk_update_request: I/O error, dev nbd0, sector 305152
> 329.18 blk_update_request: I/O error, dev nbd0, sector 305408
> 329.19 blk_update_request: I/O error, dev nbd0, sector 305664
> 329.20 blk_update_request: I/O error, dev nbd0, sector 305920
> 329.21 blk_update_request: I/O error, dev nbd0, sector 306176
> 329.22 blk_update_request: I/O error, dev nbd0, sector 306432
> 329.23 blk_update_request: I/O error, dev nbd0, sector 306688
> 329.24 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 8388608 size 8388608 starting block 38400)

Everything else in the trace looks like fallout from the initial lost
connection - once the kernel can't communicate to the NBD server, it has
to fail all pending and subsequent I/O requests to /dev/nbd0.  But until
we can figure out why the connection is dropped, seeing this part of the
trace doesn't add any information about the root cause.
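
As a rough cross-check that these errors describe one nearby region
rather than unrelated failures, the sector numbers reported by
blk_update_request can be converted to filesystem blocks. The sketch
below assumes 512-byte sectors and 4096-byte ext4 blocks (the block size
is not stated in the report), and the helper names are made up for
illustration.

```shell
#!/bin/sh
# Assumed geometry: 512-byte kernel sectors, 4096-byte filesystem blocks.
SECTOR_SIZE=512
BLOCK_SIZE=4096

# Convert a 512-byte sector number to the filesystem block containing it.
sector_to_block() { echo $(( $1 * SECTOR_SIZE / BLOCK_SIZE )); }

# Convert a filesystem block number to its first 512-byte sector.
block_to_sector() { echo $(( $1 * BLOCK_SIZE / SECTOR_SIZE )); }

sector_to_block 304384   # first failing sector in the log; prints 38048
block_to_sector 38400    # "starting block" in the ext4 warning; prints 307200
```

Under those assumptions, the first failed request and the ext4 warning
are less than 1.5 MiB apart, which fits the picture of a single
writeback stream failing once the connection was gone.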

But oddly enough, once things go south in the kernel nbd module, it
leads to a full-on kernel bug:

> GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
> 329.51 Workqueue: writeback wb_workfn (flush-43:0)
> 329.52 task: 977bec759e00 task.stack: a2930524c000
> 329.53 RIP: 0010:submit_bh_wbc+0x155/0x160
> 329.54 RSP: 0018:a2930524f7e0 EFLAGS: 00010246
> 329.55 RAX: 00620005 RBX: 977f05cddc18 RCX: 
> 329.56 RDX: 977f05cddc18 RSI: 00020800 RDI: 0001
> 329.57 RBP: a2930524f808 R08: ff00 R09: 00ff
> 329.58 R10: a2930524f920 R11: 058c R12: a598
> 329.59 R13: ba15c500 R14: 977fe1bab400 R15: 977fea643000
> 329.60 FS: () GS:977befa0()
> knlGS:
> 329.61 CS: 0010 DS:  ES:  CR0: 80050033
> 329.62 CR2: 7f7d7010 CR3: 00035ce0e000 CR4: 001406e0
> 329.63 Call Trace:
> 329.64 __sync_dirty_buffer+0x41/0xa0
> 329.65 ext4_commit_super+0x1d6/0x2a0
> 329.66 __ext4_error_inode+0xb2/0x170

> 329.99 JBD2: Error -5 detected when updating journal superblock for nbd0-8.
> 329.100 Aborting journal on device nbd0-8.
> 329.101 [ cut here ]
> 329.102 kernel BUG at /kernel-source//fs/buffer.c:3091!

Well, that should certainly be reported to the kernel folks; there is
nothing qemu can do about it. A userspace process serving NBD data over
a socket should not be able to make the kernel NBD client crash the
kernel, no matter how much data is lost when the socket disappears out
from under the kernel.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





[Qemu-devel] (v2. forward to qemu )-Panic with ext4, nbd, qemu-img, block

2018-01-21 Thread Hongzhi, Song

Hello,

I created a virtual disk image using qemu-img, mapped it to /dev/nbd0
with qemu-nbd, and mounted the device on a local directory as ext4.

When I then write to the image with rsync or dd, I hit errors in the
ext4 filesystem and the block layer.


Reproduce:

    qemu-img create test.img 2G
    mkfs.ext4 -F test.img
    qemu-nbd -f raw -c /dev/nbd0 test.img
    mount -t ext4 /dev/nbd0 LOCAL_DIR/
    rsync -av META_DATA_DIR/ LOCAL_DIR/

Qemu Version:

    QEMU emulator version 2.10.0
    Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

Kernel Version:

    4.12+HEAD (I have tested 4.15-rc7 and 4.12)

Machine:

    intel-x86-64

CPU:

    xeon

Architecture:

    I86

The problem does not always reproduce when I run the steps by hand, but
it appears often under LAVA, an automated testing framework.

It also shows up only on certain boards.

If anyone has seen similar trouble, has a fix, or wants more detail,
please contact me. Thanks.


Detail:


329.11 EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)

329.12 block nbd0: Connection timed out
329.13 block nbd0: shutting down sockets
329.14 blk_update_request: I/O error, dev nbd0, sector 304384
329.15 blk_update_request: I/O error, dev nbd0, sector 304640
329.16 blk_update_request: I/O error, dev nbd0, sector 304896
329.17 blk_update_request: I/O error, dev nbd0, sector 305152
329.18 blk_update_request: I/O error, dev nbd0, sector 305408
329.19 blk_update_request: I/O error, dev nbd0, sector 305664
329.20 blk_update_request: I/O error, dev nbd0, sector 305920
329.21 blk_update_request: I/O error, dev nbd0, sector 306176
329.22 blk_update_request: I/O error, dev nbd0, sector 306432
329.23 blk_update_request: I/O error, dev nbd0, sector 306688
329.24 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 8388608 size 8388608 starting block 38400)

329.25 Buffer I/O error on device nbd0, logical block 38144
329.26 Buffer I/O error on device nbd0, logical block 38145
329.27 Buffer I/O error on device nbd0, logical block 38146
329.28 Buffer I/O error on device nbd0, logical block 38147
329.29 Buffer I/O error on device nbd0, logical block 38148
329.30 Buffer I/O error on device nbd0, logical block 38149
329.31 Buffer I/O error on device nbd0, logical block 38150
329.32 Buffer I/O error on device nbd0, logical block 38151
329.33 Buffer I/O error on device nbd0, logical block 38152
329.34 Buffer I/O error on device nbd0, logical block 38153
329.35 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 8388608 size 8388608 starting block 38656)
329.36 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 8388608 size 8388608 starting block 38912)
329.37 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 16777216 size 8388608 starting block 39168)
329.38 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 16777216 size 8388608 starting block 39424)
329.39 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 16777216 size 8388608 starting block 39680)
329.40 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 16777216 size 8388608 starting block 39936)
329.41 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 16777216 size 8388608 starting block 40192)
329.42 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 16777216 size 8388608 starting block 40448)
329.43 EXT4-fs error (device nbd0): __ext4_get_inode_loc:4520: inode #222: block 174: comm kworker/u113:0: unable to read itable block
329.44 EXT4-fs warning (device nbd0): ext4_end_bio:322: I/O error -5 writing to inode 160 (offset 16777216 size 8388608 starting block 40704)

329.45 [ cut here ]
329.46 kernel BUG at /kernel-source//fs/buffer.c:3091!
329.47 invalid opcode:  [#1] PREEMPT SMP
329.48 Modules linked in: nbd xt_CHECKSUM iptable_mangle ipt_REJECT 
nf_reject_ipv4 xt_tcpudp ebtable_filter ebtables ip6_tables 
ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink 
xfrm_user iptable_nat xt_addrtype iptable_filter ip_tables xt_conntrack 
x_tables br_netfilter bridge stp llc intel_rapl sb_edac intel_powerclamp 
coretemp crct10dif_pclmul crct10dif_common aesni_intel aes_x86_64 
crypto_simd cryptd glue_helper iTCO_wdt iTCO_vendor_support lpc_ich 
i2c_i801 wmi acpi_pad acpi_power_meter nfsd openvswitch nf_defrag_ipv6 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
kvm_intel kvm irqbypass fuse
329.49 CPU: 30 PID: 6 Comm: kworker/u113:0 Not tainted 4.12.18-rt0-yocto-preempt-rt #1
329.50 Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014

329.51 Workqueue: writeback wb_workfn (flush-43:0)