Re: [PATCH 12/12] Btrfs: Fix file clone when source offset is not 0
Jan Schmidt wrote:
> On 30.01.2012 07:33, Li Zefan wrote:
>> Jan Schmidt wrote:
>>> I was looking at the clone range ioctl and have some remarks:
>>>
>>> On 27.01.2011 09:46, Li Zefan wrote:
>>>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
>>>> index f87552a..1b61dab 100644
>>>> --- a/fs/btrfs/ioctl.c
>>>> +++ b/fs/btrfs/ioctl.c
>>>> @@ -1788,7 +1788,10 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
>>>>  	memcpy(&new_key, &key, sizeof(new_key));
>>>>  	new_key.objectid = inode->i_ino;
>>>> -	new_key.offset = key.offset + destoff - off;
>>>> +	if (off <= key.offset)
>>>> +		new_key.offset = key.offset + destoff - off;
>>>> +	else
>>>> +		new_key.offset = destoff;
>>>                                  ^^^
>>> 1) This looks spurious to me. What if destoff isn't aligned? That's what
>>> the "key.offset - off" code above is for. Before the patch, the code
>>> didn't work at all, I agree. But this fix can only work for aligned
>>> requests.
>>>
>>> 2) The error in new_key also has propagated to the extent item's backref
>>> and wasn't fixed there. I did a range clone and ended up with an extent
>>> item like that:
>>>
>>> item 30 key (1318842368 EXTENT_ITEM 131072) itemoff 1047 itemsize 169
>>>     extent refs 8 gen 11103 flags 1
>>>     [...]
>>>     extent data backref root 257 objectid 272 offset 18446744073709494272 count 1
>>>
>>> The last offset (equal to -14 * 4k) is obviously wrong. I didn't figure
>>> out how the variables are computed, but it looks like there's something
>>> wrong with the "datao" u64 to me.
>>
>> Unfortunately this is expected. The calculation is:
>>
>>     extent_item.extent_data_ref.offset = file_pos - file_extent.extent_offset
>>
>> so you may get negative offset.
>
> I see where the negative offset comes from. But what can this offset be
> used for?
>
>> The design idea was to reduce the number of extent backrefs in that a
>> data backref can point to different file extents in the same file (in
>> this case the count field > 1). We didn't expect negative offset until
>> range clone was implemented.
>
> Reducing the number of backrefs is a good thing. In case of count > 1,
> it's clear that the offset cannot reference all of the extent data items.
>
> We have different design choices:
> a) Use the above computation and leave the filesystem with an unusable
>    offset value for extent backrefs.
> b) Use either one of the extent data item offsets this backref references.
> c) Always use a predefined constant (like 0 or -1) when count > 1.
> d) Disallow count > 1 for those refs and turn them into shared refs as
>    soon as count gets > 1.

I expressed the same doubt. See this thread:

http://marc.info/?t=13142591281r=1w=2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
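For anyone who wants to check the arithmetic in this thread, here is a rough Python sketch (not kernel code). It assumes the comparison in the patched hunk is `off <= key.offset`, and it models u64 wraparound by masking to 64 bits. The helper names and the sample inputs are made up for illustration; only the observed offset value and the formula `file_pos - file_extent.extent_offset` come from the messages above.

```python
U64_MASK = (1 << 64) - 1  # key and backref offsets are unsigned 64-bit values


def as_u64(value):
    """Wrap a Python int the way C unsigned 64-bit arithmetic would."""
    return value & U64_MASK


def as_s64(value):
    """Reinterpret a u64 bit pattern as a signed 64-bit integer."""
    value &= U64_MASK
    return value - (1 << 64) if value >> 63 else value


def clone_key_offset(key_offset, off, destoff):
    """new_key.offset per the patched hunk (assumed `off <= key.offset`)."""
    if off <= key_offset:
        return as_u64(key_offset + destoff - off)
    return destoff


def data_backref_offset(file_pos, extent_offset):
    """extent_data_ref.offset = file_pos - file_extent.extent_offset, as u64."""
    return as_u64(file_pos - extent_offset)


# The offset seen in the extent item dump, read back as a signed value.
observed = 18446744073709494272
print(as_s64(observed))          # -57344
print(as_s64(observed) // 4096)  # -14

# Hypothetical inputs that produce the same value: a file position of 0
# referencing data that starts 14 blocks into the extent.
print(data_backref_offset(0, 14 * 4096) == observed)  # True

# Without the patch, a source offset past key.offset wraps around.
print(as_u64(0 + 4096 - 8192) == (1 << 64) - 4096)  # True
print(clone_key_offset(0, 8192, 4096))              # 4096
```

The last two lines show why the unpatched expression could not work when `off` is larger than `key.offset`, and also why the patched branch simply returns `destoff`, which is the case Jan questions for unaligned requests.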
Re: brtfs on top of dmcrypt with SSD. No corruption iff write cache off?
On Sun, Jan 29, 2012 at 04:37:54PM -0800, Marc MERLIN wrote:
> Howdy,
>
> I'm considering using btrfs for my new laptop install.
>
> Encryption is however a requirement, and ecryptfs doesn't quite cut it for
> me, so that leaves me with dmcrypt which is what I've been using with
> ext3/ext4 for years.
>
> https://btrfs.wiki.kernel.org/articles/g/o/t/Gotchas.html
> still states that
> 'dm-crypt block devices require write-caching to be turned off on the
> underlying HDD'
> While the report was for 2.6.33, I'll assume it's still true.
>
> I was considering migrating to a recent 256GB SSD and 3.2.x kernel.
>
> First, I'd like to check if the 'turn off write cache' comment is still
> accurate and if it does apply to SSDs too.
>
> Second, I was wondering if anyone is running btrfs over dmcrypt on an SSD
> and what the performance is like with write cache turned off (I'm actually
> not too sure what the impact is for SSDs considering that writing to flash
> can actually be slower than writing to a hard drive).

Performance without the cache on is going to vary wildly from one SSD to
another. Some really need it to give them nice fat writes while others
do better on smaller writes. It's best to just test yours and see.

With a 3.2 kernel (it really must be 3.2 or higher), both btrfs and dm
are doing the right thing for barriers.

-chris
Re: btrfs bug
On Tue, Jan 31, 2012 at 11:20 PM, Thomas Weber
<thomas.weber.li...@googlemail.com> wrote:
> Hello Mitch,
>
> I have good news for you. I looked through all log files and found in
> the everything.log the following:
>
> Regards,
> Thomas
>
> Jan 31 05:12:24 localhost kernel: [87276.968049] btrfs memmove bogus src_offset 1870 move len 687876531 len 4096
> Jan 31 05:12:24 localhost kernel: [87276.968136] [ cut here ]
> Jan 31 05:12:24 localhost kernel: [87276.968222] kernel BUG at fs/btrfs/extent_io.c:4357!
> Jan 31 05:12:24 localhost kernel: [87276.968296] invalid opcode: [#1] PREEMPT SMP
> [...snip...]

This is coming from a BUG_ON(1) in the memcpy_extent_buffer() function
in extent_io.c:

	if (src_offset + len > dst->len) {
		printk(KERN_ERR "btrfs memmove bogus src_offset %lu move "
		       "len %lu dst len %lu\n", src_offset, len, dst->len);
		BUG_ON(1);
	}

So, since (1870 + 687876531) > 4096, the BUG_ON was triggered.

There are two calls to memcpy_extent_buffer() from
setup_items_for_insert (the next function back in the call trace shown
from the BUG_ON), so that part makes sense, at least.

I don't know if anybody else has anything to say on this, but my best
guess is that this btrfs volume has picked up some corruption that is
feeding in bad values.
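As a quick sanity check of the numbers in that log line, the bounds test can be sketched in Python. This only mirrors the comparison quoted above; the function name is hypothetical and is not part of the kernel source.

```python
def copy_is_in_bounds(src_offset, length, dst_len):
    """True when the range [src_offset, src_offset + length) fits in dst_len."""
    return src_offset + length <= dst_len


# Values taken from the kernel log line in the report.
src_offset, move_len, dst_len = 1870, 687876531, 4096

print(src_offset + move_len)                              # 687878401
print(copy_is_in_bounds(src_offset, move_len, dst_len))   # False

# Largest copy that would still pass the check from this source offset.
print(dst_len - src_offset)                               # 2226
```

A requested length of about 656 MiB against a 4096-byte buffer is far outside anything a sane caller would ask for, which fits the guess that the value comes from corrupted metadata rather than from an off-by-one.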
Re: [3.2.1] BUG at fs/btrfs/inode.c:1588
Interestingly, the filesystem was not unmountable - system hung. After
reisub and checking again with btrfs scrub no errors were reported and it
just rsync'ed fine this time. This does not make sense to me.

In any case here's my backup script although I see nothing special with it:

    #!/bin/bash

    DATE=$(date +%Y%m%d-%H%M)
    BASEDIR=/mnt/usb-backup

    ionice -c3 -p$$
    mount ${BASEDIR} && (
        [ -d ${BASEDIR}/snapshots/system-${DATE} ] || (
            time rsync -avAXH --delete --inplace --no-whole-file --stats \
                --exclude /proc/ \
                --exclude /dev/ \
                --exclude /sys/ \
                --exclude /boot/ \
                --exclude /media/ \
                --exclude /mnt/ \
                / ${BASEDIR}/current/
            btrfs subvolume snapshot \
                ${BASEDIR}/current \
                ${BASEDIR}/snapshots/system-${DATE}
            btrfs filesystem sync ${BASEDIR}
        )
        umount /mnt/usb-backup
    )

Kai Krakow <hurikhan77+bt...@gmail.com> wrote:

> Just happened while writing a huge avi file to my usb3 backup disk:
>
> [356036.596292] ------------[ cut here ]------------
> [356036.596300] kernel BUG at fs/btrfs/inode.c:1588!
> [356036.596304] invalid opcode: [#1] SMP
> [356036.596307] CPU 2
> [356036.596309] Modules linked in: vmnet(O) vmblock(O) vsock(O) vmci(O) vmmon(O) af_packet snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss nls_iso8859_15 nls_cp437 vfat fat btusb bluetooth zram(C) loop snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_seq_device gspca_sonixj gspca_main videodev v4l2_compat_ioctl32 pcspkr i2c_i801 evdev unix fuse xfs nfs nfs_acl auth_rpcgss lockd sunrpc reiserfs scsi_wait_scan hid_monterey hid_microsoft hid_logitech hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech usbhid usb_storage hid sr_mod cdrom sg pata_cmd64x [last unloaded: microcode]
> [356036.596346]
> [356036.596349] Pid: 28747, comm: btrfs-fixup-1 Tainted: G C O 3.2.1-gentoo-r2 #1 To Be Filled By O.E.M. To Be Filled By O.E.M./Z68 Pro3
> [356036.596355] RIP: 0010:[811605fe] [811605fe] btrfs_writepage_fixup_worker+0xde/0x121
> [356036.596363] RSP: :8801e2379de0 EFLAGS: 00010246
> [356036.596366] RAX: RBX: ea00019b1a00 RCX:
> [356036.596370] RDX: RSI: RDI: 88008a1bbb40
> [356036.596373] RBP: 033fd000 R08: 8801e2379d2c R09: 000180240024
> [356036.596377] R10: R11: bf80bf80 R12: 88008a1bbc10
> [356036.596380] R13: R14: 8801e2379df8 R15: 033fdfff
> [356036.596384] FS: () GS:88043fb0() knlGS:
> [356036.596387] CS: 0010 DS: ES: CR0: 8005003b
> [356036.596391] CR2: 7f5ef966 CR3: 0003253f2000 CR4: 000406e0
> [356036.596394] DR0: DR1: DR2:
> [356036.596398] DR3: DR6: 0ff0 DR7: 0400
> [356036.596401] Process btrfs-fixup-1 (pid: 28747, threadinfo 8801e2378000, task 8802d2160650)
> [356036.596405] Stack:
> [356036.596407] 88008a1bbab0 88026d847540 88003a7c1f50
> [356036.596412] 88019bf62d80 88019bf62dd0 88019bf62d98
> [356036.596417] 88019bf62da8 8802d2160650 88019bf62d88 8117f23f
> [356036.596421] Call Trace:
> [356036.596426] [8117f23f] ? worker_loop+0x170/0x485
> [356036.596431] [8117f0cf] ? btrfs_queue_worker+0x272/0x272
> [356036.596435] [8117f0cf] ? btrfs_queue_worker+0x272/0x272
> [356036.596439] [810489fb] ? kthread+0x7a/0x82
> [356036.596445] [81444634] ? kernel_thread_helper+0x4/0x10
> [356036.596449] [81048981] ? kthread_worker_fn+0x135/0x135
> [356036.596453] [81444630] ? gs_change+0xb/0xb
> [356036.596456] Code: 00 00 4c 89 f1 48 8b 3c 24 e8 67 4f 01 00 48 89 df [e8 b2 70 f2 ff ba 01 00 00 00 4c 89 ee 4c 89 e7 e8 bf 29 01 00 e9 4b ff ff ff 0f 0b 41 b8 50 00 00 00 4c 89 f1 4c 89 fa 48 89 ee 48 8b 3c 24
> [356036.596478] RIP [811605fe] btrfs_writepage_fixup_worker+0xde/0x121
> [356036.596483] RSP 8801e2379de0
> [356036.653626] ---[ end trace 9fa19a7644192fb6 ]---
>
> btrfsck now finds many of these:
>
> jupiter ~ # btrfsck /dev/sde1
> root 256 inode 12746 errors 400
> root 256 inode 12747 errors 400
> root 256 inode 12748 errors 400
> root 256 inode 12749 errors 400
> root 256 inode 17141 errors 400
> root 256 inode 219966 errors 400
> root 256 inode 224243 errors 400
> root 256 inode 225245 errors 400
> root 256 inode 225354 errors 400
> root 256 inode 290639 errors 2000
> root 256 inode 291751 errors 2000
>
> This disk is used for nothing else than the following cycle:
>
> 1. mount it (compress-force=gzip)
> 2. run rsync to backup my system (using cow-friendly rsync flags)
> 3. create a snapshot
> 4. unmount it
>
> So that error must have been introduced simply by running rsync. It has
> plenty of free space (about 800 GB).
Re: brtfs on top of dmcrypt with SSD. No corruption iff write cache off?
On Wed, Feb 01, 2012 at 12:56:24PM -0500, Chris Mason wrote:
> > Second, I was wondering if anyone is running btrfs over dmcrypt on an SSD
> > and what the performance is like with write cache turned off (I'm actually
> > not too sure what the impact is for SSDs considering that writing to flash
> > can actually be slower than writing to a hard drive).
>
> Performance without the cache on is going to vary wildly from one SSD to
> another. Some really need it to give them nice fat writes while others
> do better on smaller writes. It's best to just test yours and see.
>
> With a 3.2 kernel (it really must be 3.2 or higher), both btrfs and dm
> are doing the right thing for barriers.

Thanks for the answer.
Can you confirm that I still must disable write cache on the SSD to avoid
corruption with btrfs on top of dmcrypt, or is there a chance that it just
works now?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/