Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-29 Thread james harvey
On Fri, Jun 29, 2018 at 1:09 PM, Austin S. Hemmelgarn wrote: > On 2018-06-29 11:15, james harvey wrote: >> >> On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy >> wrote: >>> >>> And an open question I have about scrub is whether it only ever is >>> c

Re: Major design flaw with BTRFS Raid, temporary device drop will corrupt nodatacow files

2018-06-29 Thread james harvey
On Thu, Jun 28, 2018 at 6:27 PM, Chris Murphy wrote: > And an open question I have about scrub is whether it only ever is > checking csums, meaning nodatacow files are never scrubbed, or if the > copies are at least compared to each other? Scrub never looks at nodatacow files. It does not

Re: [PATCH RFC] btrfs: Do extra device generation check at mount time

2018-06-28 Thread james harvey
On Thu, Jun 28, 2018 at 3:15 AM, Qu Wenruo wrote: > I'd like to make sure everyone, including developers and end-users, are > fine with the restrict error-out behavior. Yes, please error out, as a start. Requesting this was on my btrfs-to-do list. A device generation mismatch from a drive
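To make the "error out on a device generation mismatch" idea concrete, here is a minimal sketch in Python. It is not the kernel patch's logic: the function name, the dictionary input, and the device paths in the usage note are all hypothetical, and it simply assumes each device reports one generation number.

```python
from typing import Dict


def check_device_generations(generations: Dict[str, int]) -> None:
    """Raise if any device reports an older generation than the newest one.

    `generations` maps a device path to the generation number read from
    that device (hypothetical input; how it is read is out of scope here).
    """
    if not generations:
        raise ValueError("no devices given")
    newest = max(generations.values())
    # Devices that missed writes will carry a generation older than the rest.
    stale = sorted(dev for dev, gen in generations.items() if gen != newest)
    if stale:
        raise RuntimeError(
            "generation mismatch, refusing to continue: "
            + ", ".join(f"{dev} (gen {generations[dev]} < {newest})" for dev in stale)
        )
```

For example, `check_device_generations({"/dev/sda1": 100, "/dev/sdb1": 97})` raises and names `/dev/sdb1`, while two devices at the same generation pass silently.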

Re: btrfs balance did not progress after 12H

2018-06-19 Thread james harvey
On Tue, Jun 19, 2018 at 11:47 AM, Marc MERLIN wrote: > On Mon, Jun 18, 2018 at 06:00:55AM -0700, Marc MERLIN wrote: >> So, I ran this: >> gargamel:/mnt/btrfs_pool2# btrfs balance start -dusage=60 -v . & >> [1] 24450 >> Dumping filters: flags 0x1, state 0x0, force is off >> DATA (flags 0x2):

Re: [PATCH RFC] btrfs-progs: map-logical: look at next leaf if slot > items

2018-06-07 Thread james harvey
On Thu, Jun 7, 2018 at 3:26 AM, Qu Wenruo wrote: > On 2018年06月07日 15:19, james harvey wrote: >> On Thu, Jun 7, 2018 at 12:44 AM, Qu Wenruo wrote: >>> On 2018年06月07日 11:33, james harvey wrote: >>>> * btrfs_item_key_to_cpu sets key to (10955960320

Re: [PATCH v3 (Only) 2/3] btrfs-progs: map-logical: Use btrfs_next_extent_item()

2018-06-07 Thread james harvey
On Thu, Jun 7, 2018 at 4:50 AM, Su Yue wrote: > On 06/07/2018 03:20 PM, james harvey wrote: >> >> btrfs_next_extent_item() looks for BTRFS_EXTENT_ITEM_KEY and >> BTRFS_METADATA_KEY, >> which are the types we're looking for. >> >> Signed-off-by

[PATCH v3 (Only) 1/3] btrfs-progs: map-logical: look at next leaf if slot > items

2018-06-07 Thread james harvey
> number of items on this leaf. Signed-off-by: James Harvey --- btrfs-map-logical.c | 8 1 file changed, 8 insertions(+) diff --git a/btrfs-map-logical.c b/btrfs-map-logical.c index 7a8bcff9..8a4228a4 100644 --- a/btrfs-map-logical.c +++ b/btrfs-map-logical.c @@ -65,6 +65,14 @@ sta
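The condition in the subject line can be illustrated with a toy model: when a search lands on a slot that is no longer a valid position in the current leaf, the lookup has to continue at the first item of the next leaf. The sketch below is Python, not btrfs-progs code. The leaf layout (sorted, non-empty lists of integer keys) and both function names are assumptions made for illustration.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Leaf = List[int]  # toy leaf: a sorted, non-empty list of keys


def search_slot(leaves: List[Leaf], key: int) -> Tuple[int, int]:
    """Pick the last leaf whose first key is <= key, then bisect inside it."""
    leaf_index = 0
    for i, leaf in enumerate(leaves):
        if leaf[0] <= key:
            leaf_index = i
    return leaf_index, bisect_left(leaves[leaf_index], key)


def first_key_at_or_after(leaves: List[Leaf], key: int) -> Optional[int]:
    """Return the first stored key >= `key`, or None if there is none."""
    leaf_index, slot = search_slot(leaves, key)
    if slot >= len(leaves[leaf_index]):
        # The slot points past the last item: move to the next leaf
        # instead of reporting that nothing was found.
        leaf_index += 1
        slot = 0
        if leaf_index >= len(leaves):
            return None
    return leaves[leaf_index][slot]
```

With `leaves = [[10, 20, 30], [40, 50]]`, a lookup for 35 lands on slot 3 of the first leaf, which holds only three items. Without the step to the next leaf, the item at 40 would be missed.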

[PATCH v2 2/3] btrfs-progs: map-logical: Use btrfs_next_extent_item()

2018-06-07 Thread james harvey
btrfs_next_extent_item() looks for BTRFS_EXTENT_ITEM_KEY and BTRFS_METADATA_KEY, which are the types we're looking for. Signed-off-by: James Harvey --- btrfs-map-logical.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/btrfs-map-logical.c b/btrfs-map-logical.c index

[PATCH v2 3/3] Fix misspelling of forward

2018-06-07 Thread james harvey
Signed-off-by: James Harvey --- btrfs-map-logical.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/btrfs-map-logical.c b/btrfs-map-logical.c index 8a41b037..59ba731b 100644 --- a/btrfs-map-logical.c +++ b/btrfs-map-logical.c @@ -39,7 +39,7 @@ static FILE *info_file

[PATCH v2 1/3] btrfs-progs: map-logical: look at next leaf if slot > items

2018-06-07 Thread james harvey
> number of items on this leaf. Signed-off-by: James Harvey --- btrfs-map-logical.c | 8 1 file changed, 8 insertions(+) diff --git a/btrfs-map-logical.c b/btrfs-map-logical.c index 7a8bcff9..2451012b 100644 --- a/btrfs-map-logical.c +++ b/btrfs-map-logical.c @@ -65,6 +65,14 @@ static

Re: [PATCH RFC] btrfs-progs: map-logical: look at next leaf if slot > items

2018-06-07 Thread james harvey
On Thu, Jun 7, 2018 at 12:20 AM, Su Yue wrote: > On 06/07/2018 11:33 AM, james harvey wrote: >> Using btrfs-progs v4.16: >> >> No extent found at range [10955980800,10955984896) >> >> But, this extent exists. btrfs-debug-tree shows: > Make sense. IMP th

[PATCH RFC] btrfs-progs: map-logical: look at next leaf if slot > items

2018-06-06 Thread james harvey
looks for both BTRFS_EXTENT_ITEM_KEY and BTRFS_METADATA_ITEM_KEY, which is what we need. (Granted, inside, it's just calling btrfs_next_item().) Also fixed misspelling of "foward" to "forward". Signed-off-by: James Harvey --- btrfs-map-logical.c | 18 +- 1 fi

Re: [PATCH RFC ver.B] btrfs: scrub: Don't use inode pages for device replace

2018-06-06 Thread james harvey
tle slower, but it does the correct csum checking and won't cause > such data corruption cause by "optimization". > > Reported-by: James Harvey > Signed-off-by: Qu Wenruo Reviewed-by: James Harvey As expected, this fixes the problem. I originally ran into this runnin

Re: [Bug 199931] New: systemd/rtorrent file data corruption when using echo 3 >/proc/sys/vm/drop_caches

2018-06-06 Thread james harvey
On Wed, Jun 6, 2018 at 3:06 PM, Marc Lehmann wrote: > On Tue, Jun 05, 2018 at 05:52:38PM -0400, james harvey > wrote: >> >> This is not always reproducible, but when deleting our journal, creating >> >> log >> >> messages for a few hours a

[PATCH] btrfs-progs: btrfs_close_devices(): only fsync() if device->writeable

2018-06-06 Thread james harvey
->devices, dev_list) { ... if (flags & O_RDWR) 332:device->writeable = 1 kernel btrfs_close_devices() does not have a corresponding fsync() that I see. Signed-off-by: James Harvey --- volumes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/volumes

[PATCH v2] btrfs-progs: device: Added verbose option to scan

2018-06-05 Thread james harvey
is used by chunk-recover and super-recover, is unaffected. (diff doesn't show the patch most optimally.) Signed-off-by: James Harvey --- cmds-device.c | 20 --- utils.c | 70 ++- utils.h | 5 ++-- 3 files changed, 84

Re: [PATCH] btrfs-progs: device: Added verbose option to scan

2018-06-05 Thread james harvey
On Tue, Jun 5, 2018 at 10:52 PM, james harvey wrote: > Only works when device(s) not specified. > ... Pretend the original email subject was: [PATCH] btrfs-progs: device: Added verbose option to scan -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs"

[PATCH] Added verbose option to btrfs device scan

2018-06-05 Thread james harvey
Only works when device(s) not specified. At verbose level 1, for each registered btrfs filesystem, compactly show fs uuid, and for each of its devices, the device id, name, and uuid. At verbose level 2, show everything for the fs and its devices. Previous behavior of print_all_devices(), which

Re: [Bug 199931] New: systemd/rtorrent file data corruption when using echo 3 >/proc/sys/vm/drop_caches

2018-06-05 Thread james harvey
On Tue, Jun 5, 2018 at 4:03 PM, Andrew Morton wrote: > On Tue, 05 Jun 2018 18:01:36 + bugzilla-dae...@bugzilla.kernel.org wrote: > >> https://bugzilla.kernel.org/show_bug.cgi?id=199931 >> >> Bug ID: 199931 >>Summary: systemd/rtorrent file data corruption when using

Re: Questions from aspiring btrfs mini-debugger/mini-developer

2018-06-04 Thread james harvey
On Mon, May 28, 2018 at 8:48 AM, Qu Wenruo wrote: > On 2018年05月28日 17:21, james harvey wrote: >> #29, through btrfs-tree-debug, is: >> >> item 49 key (71469 EXTENT_DATA 3768320) itemoff 13232 itemsize 53 >> generation 218 type 1 (regular) >

Re: [PATCH] btrfs-progs: check: Also compare data between mirrors to detect corruption for NODATASUM extents

2018-05-29 Thread james harvey
On Tue, May 29, 2018 at 3:45 AM, Qu Wenruo wrote: > As the lzo corruption reported by James Harvey, for data extents without > checksum, neither btrfs check nor kernel scrub could detect anything > wrong. > > However if our profile supports duplication, we still have a cha
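As a rough sketch of what "compare data between mirrors" means for extents that have no checksum, the Python below compares two copies block by block and reports where they differ. It is not the btrfs check implementation. The function name and the 4096-byte block size are assumptions.

```python
from typing import List, Tuple


def differing_ranges(
    copy_a: bytes, copy_b: bytes, block_size: int = 4096
) -> List[Tuple[int, int]]:
    """Return (offset, length) for every block where the two copies differ."""
    if len(copy_a) != len(copy_b):
        raise ValueError("mirror copies must be the same length")
    mismatches = []
    for offset in range(0, len(copy_a), block_size):
        block_a = copy_a[offset:offset + block_size]
        block_b = copy_b[offset:offset + block_size]
        if block_a != block_b:
            mismatches.append((offset, len(block_a)))
    return mismatches
```

Without a checksum, a mismatch only shows that the copies disagree. It does not say which copy holds the correct data.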

Questions from aspiring btrfs mini-debugger/mini-developer

2018-05-28 Thread james harvey
I'm tracking down some more bugs. Useful information for you to track down these bugs isn't in this email. This is more about an aspiring btrfs mini-debugger/mini-developer asking for some guidance, to be able to get the more useful information. I ran across some mirrored files that are

Re: off-by-one uncompressed invalid ram_bytes corruptions

2018-05-18 Thread james harvey
On Fri, May 18, 2018 at 1:49 AM, Qu Wenruo wrote: > And btrfs check doesn't report the same problem as the default original > mode doesn't have such check. > > Please also post the result of "btrfs check --mode=lowmem /dev/sda1" Are you saying "--mode=lowmem" does more

Re: btrfs device replace can cause silent or noisy corruption on compressed NOCOW/NODATASUM

2018-05-18 Thread james harvey
On Thu, May 17, 2018 at 5:46 AM, james harvey <jamespharve...@gmail.com> wrote: > ... > I of course don't know the extent of this. I don't know all of the > situations where NOCOW/NODATASUM extents are compressed anyway. In my > real world case, it was journald logs. We kno

Re: btrfs device replace can cause silent or noisy corruption on compressed NOCOW/NODATASUM

2018-05-18 Thread james harvey
On Thu, May 17, 2018 at 5:46 AM, james harvey <jamespharve...@gmail.com> wrote: > ... > In short, "btrfs device replace" caused it... > ... This should read "btrfs replace". Ran a 2 year old ISO, archlinux-2016.04.01-dual.iso. Kernel 4.4.5, btrfs-progs v4

btrfs device replace can cause silent or noisy corruption on compressed NOCOW/NODATASUM

2018-05-17 Thread james harvey
Looks like Qu may have taken care of corrupted compressed data with NODATASUM from causing random kernel memory corruption. As long as the compressed data was valid and could be uncompressed, there were no problems, even on data marked NOCOW/NODATASUM. If the data being sent to be

Re: [PATCH] btrfs: inode: Don't compress if NODATASUM or NODATACOW set

2018-05-14 Thread james harvey
Don't know if this will help. I just learned about pstore, and see in there a dmesg that's interesting. The serial port kernel errors started this time with "BUG: unable to handle kernel paging request". The pstore dmesg has everything from there until the end of the first trace. But, the

Re: [PATCH] btrfs: inode: Don't compress if NODATASUM or NODATACOW set

2018-05-14 Thread james harvey
On Mon, May 14, 2018 at 12:52 PM, David Sterba wrote: > On Mon, May 14, 2018 at 03:02:10PM +0800, Qu Wenruo wrote: >> As btrfs(5) specified: >> >> Note >> If nodatacow or nodatasum are enabled, compression is disabled. >> >> If NODATASUM or NODATACOW set, we should

Re: [PATCH] btrfs: inode: Don't compress if NODATASUM or NODATACOW set

2018-05-14 Thread james harvey
On Mon, May 14, 2018 at 6:35 AM, Qu Wenruo wrote: > And if possible, please don't just remove those offending files (yet). > Your binary dump would help a lot locating the root case. Absolutely. This is on a 50G LVM root volume, so I've been able to leave the original

Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-14 Thread james harvey
On Mon, May 14, 2018 at 2:36 AM, Qu Wenruo wrote: > OK, I could reproduce it now. > > Just mount with -o nodatasum, then create a file. > Remount with compress-force=lzo, then write something. > > So at least btrfs should disallow such thing. > > Thanks, > Qu Would the

Re: [PATCH] btrfs: inode: Don't compress if NODATASUM or NODATACOW set

2018-05-14 Thread james harvey
On Mon, May 14, 2018 at 4:36 AM, Nikolay Borisov wrote: > On 14.05.2018 11:20, Roman Mamedov wrote: >> On Mon, 14 May 2018 11:10:34 +0300 >> Nikolay Borisov wrote: >> >>> But if we have mounted the fs with FORCE_COMPRESS shouldn't we disregard >>> the inode

Re: [PATCH] btrfs: inode: Don't compress if NODATASUM or NODATACOW set

2018-05-14 Thread james harvey
On Mon, May 14, 2018 at 4:20 AM, Roman Mamedov wrote: > On Mon, 14 May 2018 11:10:34 +0300 > Nikolay Borisov wrote: > >> But if we have mounted the fs with FORCE_COMPRESS shouldn't we disregard >> the inode flags, presumably the admin knows what he is doing?

Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-13 Thread james harvey
On Sun, May 13, 2018 at 10:08 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote: > On 2018年05月12日 13:08, james harvey wrote: >> Hardware is fine. Passes memtest86+ in SMP mode. Works fine on all >> other files. >> >> >> >> [ 381.869940

Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-13 Thread james harvey
Reported at: https://bugzilla.kernel.org/show_bug.cgi?id=199707

Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-13 Thread james harvey
<li...@colorremedies.com> wrote: > On Sat, May 12, 2018 at 6:10 PM, james harvey <jamespharve...@gmail.com> > wrote: >> Does this mean that although I've never had a corrupted disk bit >> before on COW/checksummed data, one somehow happened on the small >> fraction of my sto

Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-12 Thread james harvey
(Conversation order changed to put program output at bottom) On Sat, May 12, 2018 at 10:09 PM, Chris Murphy <li...@colorremedies.com> wrote: > On Sat, May 12, 2018 at 6:10 PM, james harvey <jamespharve...@gmail.com> > wrote: >> Does this mean that although I've never

Re: "decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-12 Thread james harvey
On Sat, May 12, 2018 at 3:51 AM, Martin Steigerwald <mar...@lichtvoll.de> wrote: > Hey James. > > james harvey - 12.05.18, 07:08: >> 100% reproducible, booting from disk, or even Arch installation ISO. >> Kernel 4.16.7. btrfs-progs v4.16. >> >> Reading one

"decompress failed" in 1-2 files always causes kernel oops, check/scrub pass

2018-05-11 Thread james harvey
100% reproducible, booting from disk, or even Arch installation ISO. Kernel 4.16.7. btrfs-progs v4.16. Reading one of two journalctl files causes a kernel oops. Initially ran into it from "journalctl --list-boots", but cat'ing the file does it too. I believe this shows there's compressed data

Hard reset during balance - How to continue?

2018-03-23 Thread james harvey
If a system unexpectedly reboots during a balance, what's the best next step? From an ISO, read-only operations/mounting looks fine. Didn't want to make any writes until I ask, because I see some reports of bad things happening with a reboot during a balance, and don't want to mis-step. btrfs

Re: btrfs send extremely slow (almost stuck)

2016-08-28 Thread james harvey
On Sun, Aug 28, 2016 at 12:15 PM, Oliver Freyermuth wrote: > For me, this means I have to stay with rsync backups, which are sadly > incomplete since special FS attrs > like "C" for nocow are not backed up. Should be able to make a script that creates a textfile
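One possible shape for such a script, sketched in Python under assumptions: it takes the text output of `lsattr` (for example from `lsattr -R`) and keeps the paths whose flag field contains `C`. The function name is hypothetical, and the sample lines in the usage note are made up for illustration.

```python
from typing import List


def nocow_paths(lsattr_output: str) -> List[str]:
    """Return the paths whose lsattr flag field contains 'C' (No_COW)."""
    paths = []
    for line in lsattr_output.splitlines():
        # Each entry is "<flags> <path>"; split once so that paths
        # containing spaces stay intact.
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank lines and "dir:" headers printed by -R
        flags, path = parts
        if "C" in flags:
            paths.append(path)
    return paths
```

Writing the returned list to a text file alongside an rsync backup would record which files had the attribute. Restoring it is a separate problem, since on btrfs `chattr +C` is only reliable on new or empty files.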

Re: Unmountable and unrepairable BTRFS

2016-08-06 Thread james harvey
Depending on how important the data is, I wanted to point out that the most prudent first step is to get another set of drives equal to or bigger than the ones of the bad volume, and image them using dd one by one as block devices. That gives you an undo button if recovery attempts go wrong. Always the
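The imaging step can be sketched as a plain chunked copy. The Python below is a stand-in for dd, not a replacement for it: it does no error skipping or retrying, and the function name, its parameters, and the 4 MiB chunk size are assumptions.

```python
import os


def image_device(source_path: str, image_path: str,
                 chunk_size: int = 4 * 1024 * 1024) -> int:
    """Copy source_path to image_path in chunks; return bytes copied."""
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of the source device or file
            dst.write(chunk)
            copied += len(chunk)
        # Make sure the image has reached stable storage before
        # any recovery attempt touches the original.
        dst.flush()
        os.fsync(dst.fileno())
    return copied
```

Each source drive would be imaged once, to a separate file or target device, and recovery attempts would then run against the originals or the copies but never both.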

WARNING CPU at linux/fs/btrfs/ioctl.c:558 create_subvol BTRFS Transaction aborted (error -2)

2016-01-02 Thread james harvey
Fresh Arch install. Official ISO dated 1/1/2016, using linux 4.3.3-2, btrfs-progs 4.3.1-2, lvm2 2.02.137-1, and thin-provisioning-tools 0.5.6-2. Hard drive installation uses the exact same (only a day after ISO released) and snapper 0.2.8-4. I can re-create this when booted off the archiso,

Re: WARNING CPU at linux/fs/btrfs/ioctl.c:558 create_subvol BTRFS Transaction aborted (error -2)

2016-01-02 Thread james harvey
Have looked at http://comments.gmane.org/gmane.comp.file-systems.btrfs/48926 which contains a discussion and patch that is related to ioctl.c:558, not sure if it was ever committed, or if it's even the same issue I'm having. On Sat, Jan 2, 2016 at 11:40 PM, james harvey <jamespharve...@gmail.

Re: [PATCH] Btrfs: Intialize btrfs_root->highest_objectid when loading tree root and subvolume roots

2016-01-02 Thread james harvey
Bump. Pretty sure I just ran into this, outside of a testing scenario. See http://permalink.gmane.org/gmane.comp.file-systems.btrfs/51796 Looks like the patch was never committed. On Wed, Oct 7, 2015 at 10:10 AM, Chandan Rajendra wrote: > On Wednesday 07 Oct 2015

Re: corrupt leaf, bad key order & no csum found for inode & csum failed ino & input/output error

2015-12-27 Thread james harvey
XTENT_CSUM 250723954688) itemoff 9055 itemsize 12 extent csum item item 360 key (EXTENT_CSUM EXTENT_CSUM 250723971072) itemoff 9031 itemsize 24 extent csum item On Sun, Dec 27, 2015 at 8:03 PM, james harvey <jamespharve...@gmail.com> wrote: > Have gotten about 300 (mostly duplicate) BTRFS

corrupt leaf, bad key order & no csum found for inode & csum failed ino & input/output error

2015-12-27 Thread james harvey
Have gotten about 300 (mostly duplicate) BTRFS errors in the last hours. No signs of disk problem. Non-SSD SATA. Was able to dd the drive into an image file on another drive without errors. Smartctl reports no issues. It is a single disk though. Been running btrfs fine since 7/9/15,

Expected behavior of bad sectors on one drive in a RAID1

2015-10-19 Thread james harvey
Background - My fileserver had a "bad event" last week. Shut it down normally to add a new hard drive, and it would no longer POST. Tried about 50 times, doing the usual: everything non-essential unplugged, trying 1 of 4 memory modules at a time, and 1 of 2 processors at a time. Got no

Non-RAID1 areas in RAID1, data 8MB, Metadata 8MB, System 4MB

2015-10-19 Thread james harvey
The create RAID1 example illustrates my question, at: https://btrfs.wiki.kernel.org/index.php/UseCases#How_do_I_create_a_RAID1_mirror_in_Btrfs.3F It shows: mkfs.btrfs -m raid1 -d raid1 /dev/sda1 /dev/sdb1 will result in: btrfs fi df /mount Data, RAID1: total=1.00GB, used=128.00KB Data:

N-Way (traditional) RAID-1 development status

2015-10-19 Thread james harvey
Wanted to see if there's active development on N-Way (traditional) RAID-1. By this, I mean that RAID-1 across "n" disks traditionally means "n" copies of data, but btrfs currently implements RAID-1 as "2" copies of data. So, unlike traditional RAID-1, losing 2 drives in a many drive array might
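The difference described here comes down to how many copies of each block exist. A small Python sketch, assuming equal-sized devices, makes the trade-off explicit. The function name and the example figures are hypothetical.

```python
from typing import Dict


def raid1_summary(num_devices: int, device_size_gb: float,
                  copies: int) -> Dict[str, float]:
    """Usable space and guaranteed failure tolerance for mirrored data."""
    if not 1 <= copies <= num_devices:
        raise ValueError("copies must be between 1 and num_devices")
    return {
        # Every block is stored `copies` times across the pool.
        "usable_gb": num_devices * device_size_gb / copies,
        # Losing up to copies - 1 devices always leaves one copy intact.
        "survivable_failures": copies - 1,
    }
```

For four 1000 GB drives, `copies=4` (traditional n-way RAID-1) gives 1000 GB usable and survives any three failures, while `copies=2` (the btrfs behavior described above) gives 2000 GB usable and is only guaranteed to survive one.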

Re: Modifying a file in many snapshots

2015-08-27 Thread james harvey
Is there a way to do this? If not, is it a worthwhile feature request? On Thu, Aug 27, 2015 at 4:01 AM, james harvey <jamespharve...@gmail.com> wrote: > If this isn't possible, is there a way to check a given path/filename on a btrfs filesystem, to show all the other reflinked path/filenames

Modifying a file in many snapshots

2015-08-26 Thread james harvey
I'm using btrfs and snapper. So, I have many periodic btrfs snapshots, which are marked read-only (can of course be changed by btrfs to r/w and changed back.) Let's say during my initial install I set a vimrc. And, let's say now I want to change the vimrc, so that if I go back to any of the

Re: Modifying a file in many snapshots

2015-08-26 Thread james harvey
If this isn't possible, is there a way to check a given path/filename on a btrfs filesystem, to show all the other reflinked path/filenames to the same file? On Thu, Aug 27, 2015 at 3:31 AM, james harvey <jamespharve...@gmail.com> wrote: > I'm using btrfs and snapper. So, I have many periodic btrfs

mkfs.btrfs with invalid -d option uses default single drive, rather than errors

2015-07-28 Thread james harvey
Doing some fast-paced benchmarking of lots of raid levels, some in a kvm, some via RDMA access over InfiniBand using different protocols, etc. Was shocked to see horrible raid0 performance in one of the tests. Looked back through the logs and found: (Note that I typo'ed -d raid**e**0 = $
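The behavior asked for in the subject (an error rather than a silent fallback) amounts to strict validation of the profile name. A minimal Python sketch follows. It is not how mkfs.btrfs parses its options: the function name, the set of profile names, and the choice to lowercase input are assumptions.

```python
VALID_PROFILES = frozenset(
    {"raid0", "raid1", "raid5", "raid6", "raid10", "single", "dup"}
)


def parse_profile(value: str) -> str:
    """Return the normalized profile name, or raise on anything unknown."""
    profile = value.strip().lower()
    if profile not in VALID_PROFILES:
        # Fail loudly instead of falling back to a default profile.
        raise ValueError(
            f"unknown profile {value!r}; expected one of {sorted(VALID_PROFILES)}"
        )
    return profile
```

With this check, a typo such as `raide0` stops the run immediately, so a benchmark cannot proceed on a profile the user never asked for.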

INFO: task btrfs-transacti:204 blocked for more than 120 seconds. (more like 8+min)

2015-07-23 Thread james harvey
Up to date Arch. linux kernel 4.1.2-2. Fresh O/S install 12 days ago. Nowhere near full - 34G used on a 4.6T drive. 32GB memory. Installed bonnie++ 1.97-1. $ bonnie++ -d bonnie -m btrfs-disk -f -b I started trying to run with a -s 4G option, to use 4GB files for performance measuring. It

btrfs subvolume tree (btrfs-progs feature request)

2015-07-08 Thread james harvey
Request for new btrfs subvolume subcommand: tree path Display a depth indented listing of subvolumes present in the filesystem path. ---or--- list ... --tree ... ... -tree display subvolumes in a depth indented listing. Would (I think): * Display each top-level subvolume *
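A depth-indented listing of this kind can be sketched in Python from a flat list of subvolume paths. This only illustrates the requested output and is not a btrfs-progs feature. The function name and the sample paths are hypothetical, and depth is taken to be the number of listed subvolumes that contain a given path.

```python
from typing import Iterable


def format_subvolume_tree(paths: Iterable[str], indent: str = "  ") -> str:
    """Render subvolume paths as a depth-indented listing."""
    # Sort by path components so children follow their parent directly.
    cleaned = sorted({p.strip("/") for p in paths}, key=lambda p: p.split("/"))
    lines = []
    for path in cleaned:
        # Depth = how many other listed subvolumes are ancestors of this one.
        depth = sum(1 for other in cleaned if path.startswith(other + "/"))
        lines.append(indent * depth + path)
    return "\n".join(lines)
```

Counting ancestor subvolumes, rather than path components, keeps a subvolume that sits inside ordinary directories at the correct nesting level.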

btrfs subvolume clone or fork (btrfs-progs feature request)

2015-07-08 Thread james harvey
Request for new btrfs subvolume subcommand: clone or fork [-i qgroupid] source [dest]name Create a subvolume name in dest, which is a clone or fork of source. If dest is not given, subvolume name will be created in the current directory. Options -i qgroupid Add the newly created
