Re: Symlink not persisted even after fsync

2018-04-15 Thread Theodore Y. Ts'o
On Sun, Apr 15, 2018 at 07:10:52PM -0500, Vijay Chidambaram wrote:
> 
> I don't think this is what the paper's ext3-fast does. All the paper
> says is if you have a file system where the fsync of a file persisted
> only data related to that file, it would increase performance.
> ext3-fast is the name given to such a file system. Note that we do not
> present a design of ext3-fast or analyze it in any detail. In fact, we
> explicitly say "The ext3-fast file system (derived from inferences
> provided by ALICE) seems interesting for application safety, though
> further investigation is required into the validity of its design."

Well, says that it's based on ext3's data=journal "Abstract Persistent
Model".  It's true that a design was not proposed --- but if you
don't propose a design, how do you know what the performance is or
whether it's even practical?  That's one of those things I find
extremely distasteful in the paper.  Sure, I can model a faster than
light interstellar engine ala Star Trek's Warp Drive --- and I can
talk about it having, say, better performance than a reaction drive.
But it doesn't tell us anything useful about whether it can be built,
or whether it's even useful to dream about it.

To me, that part of the paper, really read as, "watch as I wave my
hands around widely, that they never leave the ends of my arms!"

> Thanks! As I mentioned before, this is useful. I have a follow-up
> question. Consider the following workload:
> 
>  creat foo
>  link (foo, A/bar)
>  fsync(foo)
>  crash
> 
> In this case, after the file system recovers, do we expect foo's link
> count to be 2 or 1? I would say 2, but POSIX is silent on this, so
> thought I would confirm. The tricky part here is we are not calling
> fsync() on directory A.
> 
> In this case, its not a symlink; its a hard link, so I would say the
> link count for foo should be 2. But btrfs and F2FS show link count of
> 1 after a crash.

Well, is the link count accurate?  That is to say, does A/bar exist?
I would think that the requirement that the file system be self
consistent is the most important consideration.

Cheers,

- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Symlink not persisted even after fsync

2018-04-15 Thread Amir Goldstein
On Mon, Apr 16, 2018 at 3:10 AM, Vijay Chidambaram  wrote:
[...]
> Consider the following workload:
>
>  creat foo
>  link (foo, A/bar)
>  fsync(foo)
>  crash
>
> In this case, after the file system recovers, do we expect foo's link
> count to be 2 or 1? I would say 2, but POSIX is silent on this, so
> thought I would confirm. The tricky part here is we are not calling
> fsync() on directory A.
>
> In this case, its not a symlink; its a hard link, so I would say the
> link count for foo should be 2. But btrfs and F2FS show link count of
> 1 after a crash.
>

That sounds like a clear bug - nlink is metadata of inode foo, so
should be made persistent by fsync(foo).

For non-journaled fs you would need to fsync(A) to guarantee
seeing A/bar after crash, but for a journaled fs, if you didn't see
A/bar after crash and did see nlink 2 on foo then you would get
a filesystem inconsistency, so practically, fsync(foo) takes care
of persisting A/bar entry as well. But as you already understand,
these rules have not been formalized by a standard, instead, they
have been "formalized" by various fsck.* tools.

Allow me to suggest a different framing for CrashMonkey.
You seem to be engaging in discussions with the community
about whether X behavior is a bug or not and as you can see
the answer depends on the filesystem (and sometimes on the
developer). Instead, you could declare that CrashMonkey
is a "Certification tool" to certify filesystems to a certain
crash consistency behavior. Then you can discuss with the
community about specific models that CrashMonkey should
be testing. The model describes the implicit dependencies
and ordering guaranties between operations.
Dave has mentioned the "strictly ordered metadata" model.
I do not know of any formal definition of this model for filesystems,
but you can take a shot at starting one and encoding it into
CrashMonkey. This sounds like a great paper to me.

I don't know if Btrfs and f2fs will qualify as "strictly ordered
metadata" and I don't know if they would want to qualify.
Mind you a filesystem can be crash consistent without
following "strictly ordered metadata". In fact, in many cases
"strictly ordered metadata" imposes performance penalty by
coupling together unrelated metadata updates (e.g. create
A/a and create B/b), but it is also quite hard to decouple them
because future operation can create a dependency (e.g.
mv A/a B/b).

Thanks,
Amir.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] btrfs: Do super block verification before writing it to disk

2018-04-15 Thread Qu Wenruo
There are already 2 reports about strangely corrupted super blocks,
where csum type and incompat flags get some obvious garbage, but csum
still matches and all other vitals are correct.

This normally means some kernel memory corruption happens, although the
cause is unknown, at least detect it and prevent further corruption.

Signed-off-by: Qu Wenruo 
---
changelog:
v2:
  Fix false alerts by moving the check to write_dev_supers() as
  btrfs_check_super_valid() only handles the primary superblock.
---
 fs/btrfs/disk-io.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 23803102aa0d..69f49f4937ea 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -68,7 +68,8 @@
 static const struct extent_io_ops btree_extent_io_ops;
 static void end_workqueue_fn(struct btrfs_work *work);
 static void free_fs_root(struct btrfs_root *root);
-static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info);
+static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
+  struct btrfs_super_block *sb);
 static void btrfs_destroy_ordered_extents(struct btrfs_root *root);
 static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
  struct btrfs_fs_info *fs_info);
@@ -2680,7 +2681,7 @@ int open_ctree(struct super_block *sb,
 
memcpy(fs_info->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 
-   ret = btrfs_check_super_valid(fs_info);
+   ret = btrfs_check_super_valid(fs_info, fs_info->super_copy);
if (ret) {
btrfs_err(fs_info, "superblock contains fatal errors");
err = -EINVAL;
@@ -3310,6 +3311,26 @@ static int write_dev_supers(struct btrfs_device *device,
 
btrfs_set_super_bytenr(sb, bytenr);
 
+   /* check the validation of the primary sb before writing */
+   if (i == 0) {
+   ret = btrfs_check_super_valid(device->fs_info, sb);
+   if (ret) {
+   btrfs_err(device->fs_info,
+   "superblock corruption detected for device %llu",
+ device->devid);
+   return -EUCLEAN;
+   }
+   /*
+* Unknown incompat flags can't be mounted, so newly
+* developed flags means corruption
+*/
+   if (btrfs_super_incompat_flags(sb) &
+   ~BTRFS_FEATURE_INCOMPAT_SUPP) {
+   btrfs_err(device->fs_info,
+   "fatal superblock corrupted detected");
+   return -EUCLEAN;
+   }
+   }
crc = ~(u32)0;
crc = btrfs_csum_data((const char *)sb + BTRFS_CSUM_SIZE, crc,
  BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE);
@@ -3985,9 +4006,9 @@ int btrfs_read_buffer(struct extent_buffer *buf, u64 
parent_transid, int level,
  level, first_key);
 }
 
-static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info)
+static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
+  struct btrfs_super_block *sb)
 {
-   struct btrfs_super_block *sb = fs_info->super_copy;
u64 nodesize = btrfs_super_nodesize(sb);
u64 sectorsize = btrfs_super_sectorsize(sb);
int ret = 0;
-- 
2.17.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 10:33 PM, Chris Murphy  wrote:
> On Sun, Apr 15, 2018 at 7:45 PM, Alexander Zapatka
>  wrote:
>> thanks, Chris.  i have given a timeout of 300 to all the drives.  they
>> are all USB, all connected to an apollo lake based htpc.  then i
>> started the command again... the dmesg output is here from a few
>> minutes after i started the btrfs device remove command.
>> https://paste.ee/p/H1R0i.  no hopes, high or low, but i'm still
>> getting the same errors.  i'll let it run though the night tho, as it
>> doesn't seem to hurt anything other then slowly lock the system up.
>>
>> on a side note, all the USB drives are either powered or are connected
>> to a powered hub..  thanks again!
>
> That's 5 minutes. I'd say something is wrong/badly designed if it's
> not giving a clear error message inside of 1 minute but the
> manufacturers have apparently decided upwards of 180. I haven't heard
> of it taking longer than 180, but good grief.

Bad proof reading: That's 180 seconds.


-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 7:45 PM, Alexander Zapatka
 wrote:
> thanks, Chris.  i have given a timeout of 300 to all the drives.  they
> are all USB, all connected to an apollo lake based htpc.  then i
> started the command again... the dmesg output is here from a few
> minutes after i started the btrfs device remove command.
> https://paste.ee/p/H1R0i.  no hopes, high or low, but i'm still
> getting the same errors.  i'll let it run though the night tho, as it
> doesn't seem to hurt anything other then slowly lock the system up.
>
> on a side note, all the USB drives are either powered or are connected
> to a powered hub..  thanks again!

That's 5 minutes. I'd say something is wrong/badly designed if it's
not giving a clear error message inside of 1 minute but the
manufacturers have apparently decided upwards of 180. I haven't heard
of it taking longer than 180, but good grief.

Anyway, at this point it sounds like it continues indefinitely in this
state and there's no point in doing that. I would not persist in
trying to use device remove until this problem is fixed using scrub as
a confirmation rather than either balance or device removal. Scrub is
faster and it's safer.


>From smartctl, it's /dev/sdc that has the bad sector, and usb 2-3 is
the device being reset, which

[3.921241] usb 2-3: Product: Elements 107C
[3.921243] usb 2-3: Manufacturer: Western Digital
[3.921245] usb 2-3: SerialNumber: 57434334453443414636334E

[3.929353] usb-storage 2-3:1.0: USB Mass Storage device detected
[3.929651] scsi host3: usb-storage 2-3:1.0

[4.994087] sd 3:0:0:0: [sdc] 976746240 4096-byte logical blocks:
(4.00 TB/3.64 TiB)

So the device with bad sector is also the device being reset but even
a 300 second command timer isn't causing the drive to report a read
error, it just hangs instead.

That's not expected.

But also, this is the only device that has a 4096 byte logical sector
size, which probably isn't related unless there's a bug here.

The smart reported LBA 1372896792 for the first error should be a 4096
byte base LBA in that case so the proper command to just toss the data
in this sector and cause firmware remapping if necessary is:

dd if=/dev/zero of=/dev/sdX bs=4096 count=1 seek=1372896792 oflag=direct

Confirm the suspect drive is still in fact /dev/sdc, since that can
change between boots. And of course umount the file system first.
There's no reason to step on 16KiB.

You can try that and then restart the long test from just prior to
that LBA and see if it finishes or stops on another sector.

smartctl -t select,1372896792-max /dev/sdX

Then mount the volume and do a scrub and see if it completes with no errors.


-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs: Do super block verification before writing it to disk

2018-04-15 Thread Qu Wenruo
There are already 2 reports about strangely corrupted super blocks,
where csum type and incompat flags get some obvious garbage, but csum
still matches and all other vitals are correct.

This normally means some kernel memory corruption happens, although the
cause is unknown, at least detect it and prevent further corruption.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/disk-io.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 23803102aa0d..10d814f03f13 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -68,7 +68,8 @@
 static const struct extent_io_ops btree_extent_io_ops;
 static void end_workqueue_fn(struct btrfs_work *work);
 static void free_fs_root(struct btrfs_root *root);
-static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info);
+static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
+  struct btrfs_super_block *sb);
 static void btrfs_destroy_ordered_extents(struct btrfs_root *root);
 static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
  struct btrfs_fs_info *fs_info);
@@ -2680,7 +2681,7 @@ int open_ctree(struct super_block *sb,
 
memcpy(fs_info->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 
-   ret = btrfs_check_super_valid(fs_info);
+   ret = btrfs_check_super_valid(fs_info, fs_info->super_copy);
if (ret) {
btrfs_err(fs_info, "superblock contains fatal errors");
err = -EINVAL;
@@ -3575,6 +3576,21 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int 
max_mirrors)
sb = fs_info->super_for_commit;
dev_item = >dev_item;
 
+   /* Do extra check on the sb to be written */
+   ret = btrfs_check_super_valid(fs_info, sb);
+   if (ret) {
+   btrfs_err(fs_info, "fatal superblock corrupted detected");
+   return -EUCLEAN;
+   }
+   /*
+* Unknown incompat flags can't be mounted, so newly developed flags
+* means corruption
+*/
+   if (btrfs_super_incompat_flags(sb) & ~BTRFS_FEATURE_INCOMPAT_SUPP) {
+   btrfs_err(fs_info, "fatal superblock corrupted detected");
+   return -EUCLEAN;
+   }
+
mutex_lock(_info->fs_devices->device_list_mutex);
head = _info->fs_devices->devices;
max_errors = btrfs_super_num_devices(fs_info->super_copy) - 1;
@@ -3985,9 +4001,9 @@ int btrfs_read_buffer(struct extent_buffer *buf, u64 
parent_transid, int level,
  level, first_key);
 }
 
-static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info)
+static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
+  struct btrfs_super_block *sb)
 {
-   struct btrfs_super_block *sb = fs_info->super_copy;
u64 nodesize = btrfs_super_nodesize(sb);
u64 sectorsize = btrfs_super_sectorsize(sb);
int ret = 0;
-- 
2.17.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: having issue removing a drive with a bad block

2018-04-15 Thread Alexander Zapatka
thanks, Chris.  i have given a timeout of 300 to all the drives.  they
are all USB, all connected to an apollo lake based htpc.  then i
started the command again... the dmesg output is here from a few
minutes after i started the btrfs device remove command.
https://paste.ee/p/H1R0i.  no hopes, high or low, but i'm still
getting the same errors.  i'll let it run though the night tho, as it
doesn't seem to hurt anything other then slowly lock the system up.

on a side note, all the USB drives are either powered or are connected
to a powered hub..  thanks again!

On Sun, Apr 15, 2018 at 8:52 PM, Chris Murphy  wrote:
> On Sun, Apr 15, 2018 at 6:30 PM, Chris Murphy  wrote:
>
>> # echo value > /sys/block/device-name/device/timeout
>>
>
> Also note that this is not a persistent setting. It needs to be done
> per boot. But before you change it, use cat to find out what the value
> is. Default is 30.
>
> I'm seeing this:
> https://github.com/neilbrown/mdadm/pull/32/commits/af1ddca7d5311dfc9ed60a5eb6497db1296f1bec
>
> Which could bmaybe be adapted from mdadm raid to look for Btrfs
> instead, or in addition to.
>
> --
> Chris Murphy



-- 
 -o)
  /\\Message void if penguin violated
_\_VDon't mess with the penguin
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Add udev-md-raid-safe-timeouts.rules

2018-04-15 Thread Chris Murphy
I just ran into this:
https://github.com/neilbrown/mdadm/pull/32/commits/af1ddca7d5311dfc9ed60a5eb6497db1296f1bec

This solution is inadequate, can it be made more generic? This isn't
an md specific problem, it affects Btrfs and LVM as well. And in fact
raid0, and even none raid setups.

There is no good reason to prevent deep recovery, which is what
happens with the default command timer of 30 seconds, with this class
of drive. Basically that value is going to cause data loss for the
single device and also raid0 case, where the reset happens before deep
recovery has a chance. And even if deep recovery fails to return user
data, what we need to see is the proper error message: read error UNC,
rather than a link reset message which just obfuscates the problem.


-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 6:30 PM, Chris Murphy  wrote:

> # echo value > /sys/block/device-name/device/timeout
>

Also note that this is not a persistent setting. It needs to be done
per boot. But before you change it, use cat to find out what the value
is. Default is 30.

I'm seeing this:
https://github.com/neilbrown/mdadm/pull/32/commits/af1ddca7d5311dfc9ed60a5eb6497db1296f1bec

Which could bmaybe be adapted from mdadm raid to look for Btrfs
instead, or in addition to.

-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
Please keep the list in the cc:

On Sun, Apr 15, 2018 at 5:55 PM, Alexander Zapatka
 wrote:
> output:
>
> $  sudo smartctl -l scterc /dev/sdc
> smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.0-38-generic] (local build)
> Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control command not supported

OK you'll need to increase the scsi command timer to something like
120. Hopefully that works. This needs to be done for each device.


# echo value > /sys/block/device-name/device/timeout

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/scsi-command-timer-device-status


> after the last reboot i haven't done anything to restart the scrub or
> remove the device...   i have a syslog from the last crash, you can
> see it here https://paste.ee/p/R2Pt7.  if is not enough, i will
> certainly start a scrub and let it crash.

>Apr 13 23:53:41 kodbox kernel: [225349.101299] usb 1-1.4.2: reset high-speed 
>USB device number 7 using xhci_hcd

Hmmm, could be there's a power issue. Not sure if it's related to the
problem or not. I see this when I direct connect laptop drives in USB
powered enclosures (no external power) directly to a my Intel NUC, but
then the problem goes away when the drive is connected to a suitably
powered USB hub, which is then connected to the computer.

But it's worth a shot to change the scsi command timer as described
resolves the problem first.

-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Symlink not persisted even after fsync

2018-04-15 Thread Vijay Chidambaram
Hi Ted,

On Sun, Apr 15, 2018 at 9:13 AM, Theodore Y. Ts'o  wrote:
> On Sat, Apr 14, 2018 at 08:35:45PM -0500, Vijaychidambaram Velayudhan Pillai 
> wrote:
>> I was one of the authors on that paper, and I didn't know until today you
>> didn't like that work :) The paper did *not* suggest we support invented
>> guarantees without considering the performance impact.
>
> I hadn't noticed that you were one of the authors on that paper,
> actually.
>
> The problem with that paper was I don't think the researchers had
> talked to anyone who had actually designed production file systems.
> For example, there are some the hypothetical ext3-fast file system
> proposed in the paper has some real practical problems.  You can't
> just switch between having the file contents being journaled via the
> data=journal mode, and file contents being written via the normal page
> cache mechanisms.  If you don't do some very heavy-weight, performance
> killing special measures, data corruption is a very real possibility.

I don't think this is what the paper's ext3-fast does. All the paper
says is if you have a file system where the fsync of a file persisted
only data related to that file, it would increase performance.
ext3-fast is the name given to such a file system. Note that we do not
present a design of ext3-fast or analyze it in any detail. In fact, we
explicitly say "The ext3-fast file system (derived from inferences
provided by ALICE) seems interesting for application safety, though
further investigation is required into the validity of its design."

> I agree that documenting what behavior applications can depend upon is
> useful.  However, this needs to be done as a conversation --- and a
> negotiation --- between application and file system developers.  (And
> not necessarily just from one operating system, either!  Application
> authors might care about whether they can get robustness guarantees on
> other operationg systems, such as Mac OS X.)  Also, the tradeoffs may
> in some cases probabilities of data loss, and not hard guarantees.
>
> Formal documentation also takes a lot of effort to write.  That's
> probably why no one has tried to formally codify it since POSIX.  We
> do have informal agreements, such as adding an implied data flush
> after certain close or renames operations.  And sometimes these are
> written up, but only informally.  A good example of this is the
> O_PONIES controversy, wherein the negotiations/conversation happened
> on various blog entries, and ultimately at an LSF/MM face-to-face
> meeting:
>
> http://blahg.josefsipek.net/?p=364
> https://sandeen.net/wordpress/uncategorized/coming-clean-on-o_ponies/
> https://lwn.net/Articles/322823/
> https://lwn.net/Articles/327601/
> https://lwn.net/Articles/351422/
>
> Note that the implied file writebacks after certain renames and closes
> (as documented at the end of https://lwn.net/Articles/322823/) was
> implemented for ext4, and then after discussion at LSF/MM, there was
> general agreement across multiple major file system maintainers that
> we should all provide similar behavior.
>
> So doing this kind of standardization, especially if you want to take
> into account all of the stakeholders, takes time and is not easy.  If
> you only take one point of view, you can have what happened with the C
> standard, where the room was packed with compiler authors, who were
> only interested in what kind of cool compiler optimizations they could
> do, and completely ignored whether the resulting standard would
> actually be useful by practicing system programmers.  Which is why the
> Linux kernel is only really supported on gcc, and then with certain
> optimizations allowed by the C standard explicitly turned off.  (Clang
> support is almost there, but not everyone trust a kernel built by
> Clang won't have some subtle, hard-to-debug problems...)

I definitely agree it takes time and effort. I'm hoping our work on
CrashMonkey can help here, by codifying the crash-consistency
guarantees into tests that new file-system developers can use.

>
> Academics could very well have a place in helping to facilitate the
> conversation.  I think my primary concern with the Pillai paper is
> that the authors apparently talked a whole bunch to application
> authors, but not nearly as much to file system developers.

I agree with this criticism. This is why my research group engages
with the file-system community right from project start, as we have
been doing with CrashMonkey.

>> But in any case, coming back to our main question, the conclusion seems to
>> be: symlinks aren't standard, so we shouldn't be studying their
>> crash-consistency properties. This is useful to know. Thanks!
>
> Well, symlinks are standardized.  But what the standards say about
> them is extremely limited.  And the crash-consistency properties you
> were looking at, which is what fsync() being called on a file
> descriptor opened via 

Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 6:14 AM, Alexander Zapatka
 wrote:
> i recently set up a drive pool in single mode on my little media
> server.  about a week later SMART started telling me that the drive
> was having issue and there is one bad sector.  since the array is far
> from full i decided to remove the drive from the pool.  but running
>
> btrfs device remove /dev/sdc /mnt/pool
>
> resulted in a deadlock.  everything crashed, and i had to pull the
> plug to reboot.  once up i did a btrfs check of the drive and it
> reported no issues with the file system...  but running the remove
> again results in a dead lock.  i have tried running a scrub and it
> eventually results in a dead lock also.

What do you get for:

$ sudo smartctl -l scterc

And can you post a complete dmesg somewhere? Chances are this deadlock
is not really a deadlock, the system is hanging because Btrfs keeps
trying to read a bad block, and it's taking the drive so long to
recover that the kernel does a SATA link reset, and then Btrfs tries
to read again and then you get another hang while the drive decides
what to do - etc and it just doesn't end. But we need the dmesg even
if it takes 30 minutes for the dmesg command to complete - it's
probably easiest to do this with ssh remotely so that the dmesg result
when it finally appears is already on another machine and you don't
have to additionally mess around with outputing it to a file and then
getting the file off the hanging machine.

And don't hard reset it. 'sudo reboot -f' should be sufficient and
safe, even if not immediate, it might take a couple minutes for it it
to actually reboot.

What I'm betting is that you've got a mismatch between the kernel's
scsi command timer (defaults to 30 seconds) and the SCT ERC setting
for the drives. If they're consumer drives they either don't support
SCT ERC or it's disabled by default, in either case the recovery can
be well in excess of 30 seconds. So what you have to do is flip that
around so the drive gives up before the kernel. So either the command
timer has to be increased, or the drive SCT ERC value must be
decreased. And hence we need more info as requested above.




-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Symlink not persisted even after fsync

2018-04-15 Thread Theodore Y. Ts'o
On Sat, Apr 14, 2018 at 08:35:45PM -0500, Vijaychidambaram Velayudhan Pillai 
wrote:
> I was one of the authors on that paper, and I didn't know until today you
> didn't like that work :) The paper did *not* suggest we support invented
> guarantees without considering the performance impact.

I hadn't noticed that you were one of the authors on that paper,
actually.

The problem with that paper was I don't think the researchers had
talked to anyone who had actually designed production file systems.
For example, there are some the hypothetical ext3-fast file system
proposed in the paper has some real practical problems.  You can't
just switch between having the file contents being journaled via the
data=journal mode, and file contents being written via the normal page
cache mechanisms.  If you don't do some very heavy-weight, performance
killing special measures, data corruption is a very real possibility.

(If you're curious as to why, see the comments in the function
ext4_change_journal_flag() in fs/ext4/inode.c, which is called when
clearing the per-file data journal flag.  We need to stop the journal,
write all dirty, journalled buffers to disk, empty the journal, and
only then can we switch a file from using data journalling to the
normal ordered data mode handling.  Now imagine ext3-fast needing to
do all of this...)

The paper also talked in terms of what file system designers should
consider; it didn't really make the same recommendation to application
authors.  If you look at Table 3(c), which listed application
"vulnerabilities" under current file systems, for the applications
that do purport to provide robustness against crashes (e.g., Postgres,
LMDB, etc.) , most of them actually work quite well, with little or
vulerabilities.  A notable example is Zookeeper --- but that might be
an example where the application is just buggy, and should be fixed.

> I don't disagree with any of this. But you can imagine how this can be all
> be confusing to file-system developers and research groups who work on file
> systems: without formal documentation, what exactly should they test or
> support? Clearly current file systems provide more than just POSIX and
> therefore POSIX itself is not very useful.

I agree that documenting what behavior applications can depend upon is
useful.  However, this needs to be done as a conversation --- and a
negotiation --- between application and file system developers.  (And
not necessarily just from one operating system, either!  Application
authors might care about whether they can get robustness guarantees on
other operationg systems, such as Mac OS X.)  Also, the tradeoffs may
in some cases probabilities of data loss, and not hard guarantees.

Formal documentation also takes a lot of effort to write.  That's
probably why no one has tried to formally codify it since POSIX.  We
do have informal agreements, such as adding an implied data flush
after certain close or renames operations.  And sometimes these are
written up, but only informally.  A good example of this is the
O_PONIES controversy, wherein the negotiations/conversation happened
on various blog entries, and ultimately at an LSF/MM face-to-face
meeting:

http://blahg.josefsipek.net/?p=364
https://sandeen.net/wordpress/uncategorized/coming-clean-on-o_ponies/   
https://lwn.net/Articles/322823/
https://lwn.net/Articles/327601/
https://lwn.net/Articles/351422/

Note that the implied file writebacks after certain renames and closes
(as documented at the end of https://lwn.net/Articles/322823/) was
implemented for ext4, and then after discussion at LSF/MM, there was
general agreement across multiple major file system maintainers that
we should all provide similar behavior.

So doing this kind of standardization, especially if you want to take
into account all of the stakeholders, takes time and is not easy.  If
you only take one point of view, you can have what happened with the C
standard, where the room was packed with compiler authors, who were
only interested in what kind of cool compiler optimizations they could
do, and completely ignored whether the resulting standard would
actually be useful by practicing system programmers.  Which is why the
Linux kernel is only really supported on gcc, and then with certain
optimizations allowed by the C standard explicitly turned off.  (Clang
support is almost there, but not everyone trust a kernel built by
Clang won't have some subtle, hard-to-debug problems...)

Academics could very well have a place in helping to facilitate the
conversation.  I think my primary concern with the Pillai paper is
that the authors apparently talked a whole bunch to application
authors, but not nearly as much to file system developers.

> But in any case, coming back to our main question, the conclusion seems to
> be: symlinks aren't standard, so we shouldn't be studying their
> crash-consistency properties. This is useful to know. Thanks!

Well, 

Re: remounted ro during operation, unmountable since

2018-04-15 Thread Duncan
Qu Wenruo posted on Sat, 14 Apr 2018 22:41:50 +0800 as excerpted:

>> sectorsize        4096
>> nodesize        4096
> 
> Nodesize is not the default 16K, any reason for this?
> (Maybe performance?)
> 
>>> 3) Extra hardware info about your sda
>>>     Things like SMART and hardware model would also help here.

>> Model Family: Samsung based SSDs Device Model: SAMSUNG SSD 830
>> Series
> 
> At least I haven't hear much problem about Samsung SSD, so I don't think
> it's the hardware to blamce. (Unlike Intel 600P)

830 model is a few years old, IIRC (I have 850s, and I think I saw 860s 
out in something I read probably on this list, but am not sure of it).  I 
suspect the filesystem was created with an old enough btrfs-tools that 
the default nodesize was still 4K, either due to older distro, or simply 
due to using the filesystem that long.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[GIT PULL] Btrfs updates for 4.17, part 2

2018-04-15 Thread David Sterba
Hi,

we have queued a few more fixes (error handling, log replay, softlockup)
and the rest is SPDX update that touches almost all files so the
diffstat is long. The top patch is a fixup for excessive warning and
was not in linux-next but I've tested it locally.

Please pull, thanks.


The following changes since commit 57599c7e7722daf5f8c2dba4b0e4628f5c500771:

  btrfs: lift errors from add_extent_changeset to the callers (2018-03-31 
02:03:25 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git 
for-4.17-part2-tag

for you to fetch changes up to 5d41be6f702f19f72db816c17175caf9dbdcdfa6:

  btrfs: Only check first key for committed tree blocks (2018-04-13 16:16:15 
+0200)


David Sterba (3):
  btrfs: replace GPL boilerplate by SPDX -- headers
  btrfs: replace GPL boilerplate by SPDX -- sources
  btrfs: add SPDX header to Kconfig

Filipe Manana (1):
  Btrfs: fix loss of prealloc extents past i_size after fsync log replay

Liu Bo (3):
  Btrfs: fix NULL pointer dereference in log_dir_items
  Btrfs: bail out on error during replay_dir_deletes
  Btrfs: clean up resources during umount after trans is aborted

Nikolay Borisov (1):
  btrfs: Fix possible softlock on single core machines

Qu Wenruo (1):
  btrfs: Only check first key for committed tree blocks

 fs/btrfs/Kconfig   |  2 +
 fs/btrfs/acl.c | 15 +-
 fs/btrfs/async-thread.c| 15 +-
 fs/btrfs/async-thread.h| 21 ++--
 fs/btrfs/backref.c | 15 +-
 fs/btrfs/backref.h | 19 ++--
 fs/btrfs/btrfs_inode.h | 19 ++--
 fs/btrfs/check-integrity.c | 15 +-
 fs/btrfs/check-integrity.h | 19 ++--
 fs/btrfs/compression.c | 15 +-
 fs/btrfs/compression.h | 19 ++--
 fs/btrfs/ctree.c   | 15 +-
 fs/btrfs/ctree.h   | 20 ++--
 fs/btrfs/dedupe.h  | 20 ++--
 fs/btrfs/delayed-inode.c   | 15 +-
 fs/btrfs/delayed-inode.h   | 19 ++--
 fs/btrfs/delayed-ref.c | 15 +-
 fs/btrfs/delayed-ref.h | 21 ++--
 fs/btrfs/dev-replace.c | 16 +-
 fs/btrfs/dev-replace.h | 20 ++--
 fs/btrfs/dir-item.c| 15 +-
 fs/btrfs/disk-io.c | 26 +-
 fs/btrfs/disk-io.h | 20 ++--
 fs/btrfs/export.c  |  1 +
 fs/btrfs/export.h  |  1 +
 fs/btrfs/extent-tree.c | 17 ++-
 fs/btrfs/extent_io.c   |  1 +
 fs/btrfs/extent_io.h   |  6 ++-
 fs/btrfs/extent_map.c  |  1 +
 fs/btrfs/extent_map.h  |  6 ++-
 fs/btrfs/file-item.c   | 15 +-
 fs/btrfs/file.c| 15 +-
 fs/btrfs/free-space-cache.c| 15 +-
 fs/btrfs/free-space-cache.h| 19 ++--
 fs/btrfs/free-space-tree.c | 15 +-
 fs/btrfs/free-space-tree.h | 19 ++--
 fs/btrfs/inode-item.c  | 15 +-
 fs/btrfs/inode-map.c   | 15 +-
 fs/btrfs/inode-map.h   |  5 +-
 fs/btrfs/inode.c   | 15 +-
 fs/btrfs/ioctl.c   | 15 +-
 fs/btrfs/locking.c | 16 +-
 fs/btrfs/locking.h | 19 ++--
 fs/btrfs/lzo.c | 15 +-
 fs/btrfs/math.h| 20 ++--
 fs/btrfs/ordered-data.c| 15 +-
 fs/btrfs/ordered-data.h| 20 ++--
 fs/btrfs/orphan.c  | 15 +-
 fs/btrfs/print-tree.c  | 15 +-
 fs/btrfs/print-tree.h  | 21 ++--
 fs/btrfs/props.c   | 15 +-
 fs/btrfs/props.h   | 19 ++--
 fs/btrfs/qgroup.c  | 15 +-
 fs/btrfs/qgroup.h  | 22 ++---
 fs/btrfs/raid56.c  | 16 +-
 fs/btrfs/raid56.h  | 21 ++--
 fs/btrfs/rcu-string.h  | 20 +++-
 fs/btrfs/reada.c   | 15 +-
 fs/btrfs/ref-verify.c  | 15 +-
 fs/btrfs/ref-verify.h  | 23 +++--
 fs/btrfs/relocation.c  | 15 +-
 fs/btrfs/root-tree.c   | 15 +-
 fs/btrfs/scrub.c   | 15 +-
 fs/btrfs/send.c| 15 +-
 fs/btrfs/send.h| 20 +++-
 fs/btrfs/struct-funcs.c| 15 +-
 fs/btrfs/super.c   | 15 +-
 

having issue removing a drive with a bad block

2018-04-15 Thread Alexander Zapatka
i recently set up a drive pool in single mode on my little media
server.  about a week later SMART started telling me that the drive
was having issue and there is one bad sector.  since the array is far
from full i decided to remove the drive from the pool.  but running

btrfs device remove /dev/sdc /mnt/pool

resulted in a deadlock.  everything crashed, and i had to pull the
plug to reboot.  once up i did a btrfs check of the drive and it
reported no issues with the file system...  but running the remove
again results in a dead lock.  i have tried running a scrub and it
eventually results in a dead lock also.

my system:

Linux kodbox 4.13.0-38-generic #43-Ubuntu SMP Wed Mar 14 15:20:44 UTC
2018 x86_64 x86_64 x86_64 GNU/Linux

 btrfs version:

btrfs-progs v4.12

though i have also compiled the latest btrfs from git and i get the
same results.

btrfs fi sh /mnt/pool

Label: none  uuid: cb30a848-1882-4f7f-aae1-1533f52d8783
Total devices 4 FS bytes used 6.73TiB
devid1 size 7.28TiB used 5.67TiB path /dev/sdb1
devid2 size 1.82TiB used 219.00GiB path /dev/sdd1
devid3 size 1.82TiB used 220.03GiB path /dev/sda1
devid4 size 3.64TiB used 655.00GiB path /dev/sdc1

btrfs fi us /mnt/pool

Overall:
Device size:  14.55TiB
Device allocated:  6.74TiB
Device unallocated:7.81TiB
Device missing:  0.00B
Used:  6.74TiB
Free (estimated):  7.81TiB  (min: 3.91TiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:6.73TiB, Used:6.72TiB
   /dev/sda1 216.00GiB
   /dev/sdb1   5.67TiB
   /dev/sdc1 655.00GiB
   /dev/sdd1 215.00GiB

Metadata,RAID1: Size:8.00GiB, Used:7.25GiB
   /dev/sda1   4.00GiB

   /dev/sdb1   8.00GiB
   /dev/sdd1   4.00GiB

System,RAID1: Size:32.00MiB, Used:768.00KiB
   /dev/sda1  32.00MiB
   /dev/sdb1  32.00MiB

Unallocated:
   /dev/sda1   1.60TiB
   /dev/sdb1   1.60TiB
   /dev/sdc1   3.00TiB
   /dev/sdd1   1.60TiB


smartmontools tells me this:

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_DescriptionStatus  Remaining
LifeTime(hours)  LBA_of_first_error
# 1  Short offline   Completed: read failure   90% 11625
  1372896792
# 2  Short offline   Completed: read failure   90% 11571
  1372896792
# 3  Extended offlineCompleted: read failure   90% 11562
  1372896792
# 4  Extended offlineCompleted without error   00% 11243 -
# 5  Short offline   Completed without error   00% 10702 -
# 6  Conveyance offline  Completed without error   00% 10702 -

can i just dd if=zero of=/dev/sdc count=1 bs=16k seek=42903024

(i get the location by calculating lba * sector size / block size, so
1372896792*512/16384

which should cause the drive to correctly reallocate the sector.  i think.

anyway, the data is not super important, but i'd like to rescue it if
i can.  is there a way to get data off of one device manually and put
it on another device?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: remounted ro during operation, unmountable since

2018-04-15 Thread Qu Wenruo


On 2018年04月15日 16:39, Timo Nentwig wrote:
> On 04/14/2018 03:45 PM, Timo Nentwig wrote:
>> On 04/14/2018 11:42 AM, Qu Wenruo wrote:
>>> And the work load when the RO happens is also helpful.
>>> (Well, the dmesg of RO happens would be the best though)
>> I had a glance at dmesg but don't remember anything specific (think
>> the usual " [cut here] ---" + dump of registers, but I'm not even
>> sure about that). Sorry. 
> 
> *cough* look, what I just found :)
> 
[snip]
> [Sun Apr  8 09:40:45 2018] CPU: 4 PID: 64370 Comm: kworker/u256:87
> Tainted: G    W  O 4.15.15-1-ARCH #1
> [Sun Apr  8 09:40:45 2018] Hardware name: System manufacturer System
> Product Name/ROG ZENITH EXTREME, BIOS 0902 12/21/2017

ROG, no wonder you're OCing.

> [Sun Apr  8 09:40:45 2018] Workqueue: btrfs-extent-refs
> btrfs_extent_refs_helper [btrfs]
> [Sun Apr  8 09:40:45 2018] RIP: 0010:btrfs_run_delayed_refs+0x167/0x1b0
> [btrfs]
> [Sun Apr  8 09:40:45 2018] RSP: 0018:a085e34d3dd8 EFLAGS: 00010282
> [Sun Apr  8 09:40:45 2018] RAX:  RBX: 96b4f02ab9c8
> RCX: 0001
> [Sun Apr  8 09:40:45 2018] RDX: 8001 RSI: a1e47fcf
> RDI: 
> [Sun Apr  8 09:40:45 2018] RBP: 96b92514 R08: 0001
> R09: 0a02
> [Sun Apr  8 09:40:45 2018] R10: 0005 R11: 
> R12: 96b915a5ec00
> [Sun Apr  8 09:40:45 2018] R13:  R14: 96b4e187c600
> R15: 96b92c411400
> [Sun Apr  8 09:40:45 2018] FS:  ()
> GS:96b92cb0() knlGS:
> [Sun Apr  8 09:40:45 2018] CS:  0010 DS:  ES:  CR0:
> 80050033
> [Sun Apr  8 09:40:45 2018] CR2: 7f2c661a96f0 CR3: 000f48fcc000
> CR4: 003406e0
> [Sun Apr  8 09:40:45 2018] Call Trace:
> [Sun Apr  8 09:40:45 2018]  delayed_ref_async_start+0x8d/0xa0 [btrfs]
> [Sun Apr  8 09:40:45 2018]  normal_work_helper+0x39/0x370 [btrfs]
> [Sun Apr  8 09:40:45 2018]  process_one_work+0x1ce/0x410
> [Sun Apr  8 09:40:45 2018]  worker_thread+0x2b/0x3d0
> [Sun Apr  8 09:40:45 2018]  ? process_one_work+0x410/0x410
> [Sun Apr  8 09:40:45 2018]  kthread+0x113/0x130
> [Sun Apr  8 09:40:45 2018]  ? kthread_create_on_node+0x70/0x70
> [Sun Apr  8 09:40:45 2018]  ret_from_fork+0x22/0x40
> [Sun Apr  8 09:40:45 2018] Code: a0 82 d8 e0 eb 90 48 8b 53 60 f0 0f ba
> aa 50 12 00 00 02 72 1b 83 f8 fb 74 37 89 c6 48 c7 c7 c8 28 53 c0 89 04
> 24 e8 c9 fa be e0 <0f> 0b 8b 04 24 89 c1 ba 04 0c 00 00 48 c7 c6 00 b9
> 52 c0 48 89
> [Sun Apr  8 09:40:45 2018] ---[ end trace 6ad220910a160dd3 ]---
> [Sun Apr  8 09:40:45 2018] BTRFS: error (device sda2) in
> btrfs_run_delayed_refs:3076: errno=-17 Object already exists

According to the code, it's just transaction abort.
I'd say it's just extent tree code hit something unexpected, maybe it's
caused by the offending tree blocks.

Not much useful info compared to debug tree grep.

Thanks,
Qu

> [Sun Apr  8 09:40:45 2018] BTRFS info (device sda2): forced readonly
> [Sun Apr  8 09:40:45 2018] BTRFS error (device sda2): pending csums is
> 331776
> 
> 
> [Tue Apr 10 05:23:22 2018] hid-generic 0003:1E71:170E.0009:
> hiddev0,hidraw0: USB HID v1.10 Device [NZXT.-Inc. NZXT USB Device] on
> usb-:01:00.0-9/input0
> [Tue Apr 10 06:19:31 2018] [ cut here ]
> [Tue Apr 10 06:19:31 2018] BTRFS: Transaction aborted (error -17)
> [Tue Apr 10 06:19:31 2018] WARNING: CPU: 18 PID: 541 at
> fs/btrfs/extent-tree.c:3076 btrfs_run_delayed_refs+0x167/0x1b0 [btrfs]
> [Tue Apr 10 06:19:31 2018] Modules linked in: cmac md4 nls_utf8 cifs ccm
> dns_resolver fscache lz4 lz4_compress zram rfcomm bnep xt_tcpudp
> iptable_filter hwmon_vid msr arc4 ext4 mbcache jbd2 fscrypto btusb
> edac_mce_amd btrtl btbcm btintel ath10k_pci kvm bluetooth ath10k_core
> mousedev ecdh_generic ath irqbypass crc16 input_leds crct10dif_pclmul
> mac80211 snd_hda_codec_realtek crc32_pclmul ghash_clmulni_intel
> snd_hda_codec_generic snd_hda_codec_hdmi pcbc eeepc_wmi wil6210
> aesni_intel igb asus_wmi snd_hda_intel aes_x86_64 sparse_keymap
> led_class crypto_simd snd_hda_codec cfg80211 glue_helper ptp wmi_bmof
> cryptd snd_hda_core pps_core mxm_wmi dca snd_hwdep sp5100_tco ccp rfkill
> snd_pcm pcspkr atlantic i2c_piix4 rng_core k10temp rtc_cmos evdev shpchp
> gpio_amdpt wmi pinctrl_amd mac_hid acpi_cpufreq vboxnetflt(O) vboxnetadp(O)
> [Tue Apr 10 06:19:31 2018]  vboxpci(O) vboxdrv(O) snd_seq_dummy
> snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_timer snd
> soundcore cuse fuse cpufreq_powersave cpufreq_ondemand crypto_user
> ip_tables x_tables sd_mod hid_apple uas usb_storage hid_saitek
> hid_generic usbhid hid ahci xhci_pci libahci xhci_hcd libata usbcore
> scsi_mod usb_common crc32c_generic crc32c_intel btrfs xor
> zstd_decompress zstd_compress xxhash raid6_pq amdgpu chash i2c_algo_bit
> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm
> agpgart
> [Tue Apr 10 06:19:31 2018] CPU: 18 PID: 541 Comm: 

Re: remounted ro during operation, unmountable since

2018-04-15 Thread Timo Nentwig

On 04/14/2018 03:45 PM, Timo Nentwig wrote:

On 04/14/2018 11:42 AM, Qu Wenruo wrote:

And the work load when the RO happens is also helpful.
(Well, the dmesg of RO happens would be the best though)
I had a glance at dmesg but don't remember anything specific (think 
the usual " [cut here] ---" + dump of registers, but I'm not even 
sure about that). Sorry. 


*cough* look, what I just found :)

[Sun Apr  8 09:40:45 2018] [ cut here ]
[Sun Apr  8 09:40:45 2018] BTRFS: Transaction aborted (error -17)
[Sun Apr  8 09:40:45 2018] WARNING: CPU: 4 PID: 64370 at 
fs/btrfs/extent-tree.c:3076 btrfs_run_delayed_refs+0x167/0x1b0 [btrfs]
[Sun Apr  8 09:40:45 2018] Modules linked in: cmac md4 nls_utf8 cifs ccm 
dns_resolver fscache lz4 lz4_compress zram rfcomm bnep xt_tcpudp 
hwmon_vid msr iptable_filter arc4 ext4 mbcache jbd2 fscrypto btusb 
edac_mce_amd
 btrtl snd_hda_codec_hdmi btbcm btintel kvm bluetooth ath10k_pci 
mousedev irqbypass ath10k_core ecdh_generic crc16 input_leds 
crct10dif_pclmul crc32_pclmul ath ghash_clmulni_intel 
snd_hda_codec_realtek pcbc mac80211 s
nd_hda_codec_generic aesni_intel wil6210 aes_x86_64 crypto_simd 
snd_hda_intel igb glue_helper cryptd snd_hda_codec cfg80211 eeepc_wmi 
ptp snd_hda_core asus_wmi pps_core ccp sparse_keymap sp5100_tco wmi_bmof 
snd_hwdep
dca rng_core pcspkr atlantic rfkill snd_pcm i2c_piix4 k10temp shpchp 
rtc_cmos evdev gpio_amdpt pinctrl_amd mac_hid acpi_cpufreq vboxnetflt(O) 
vboxnetadp(O) vboxpci(O) vboxdrv(O)
[Sun Apr  8 09:40:45 2018]  snd_seq_dummy snd_seq_oss snd_seq_midi_event 
snd_seq snd_seq_device snd_timer snd soundcore cuse fuse 
cpufreq_powersave cpufreq_ondemand crypto_user ip_tables x_tables sd_mod 
hid_apple uas
usb_storage hid_saitek hid_generic usbhid hid ahci xhci_pci libahci 
xhci_hcd libata usbcore scsi_mod usb_common crc32c_generic crc32c_intel 
btrfs xor zstd_decompress zstd_compress xxhash raid6_pq nouveau 
led_class mxm
_wmi wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt 
fb_sys_fops ttm drm agpgart
[Sun Apr  8 09:40:45 2018] CPU: 4 PID: 64370 Comm: kworker/u256:87 
Tainted: G    W  O 4.15.15-1-ARCH #1
[Sun Apr  8 09:40:45 2018] Hardware name: System manufacturer System 
Product Name/ROG ZENITH EXTREME, BIOS 0902 12/21/2017
[Sun Apr  8 09:40:45 2018] Workqueue: btrfs-extent-refs 
btrfs_extent_refs_helper [btrfs]
[Sun Apr  8 09:40:45 2018] RIP: 0010:btrfs_run_delayed_refs+0x167/0x1b0 
[btrfs]

[Sun Apr  8 09:40:45 2018] RSP: 0018:a085e34d3dd8 EFLAGS: 00010282
[Sun Apr  8 09:40:45 2018] RAX:  RBX: 96b4f02ab9c8 
RCX: 0001
[Sun Apr  8 09:40:45 2018] RDX: 8001 RSI: a1e47fcf 
RDI: 
[Sun Apr  8 09:40:45 2018] RBP: 96b92514 R08: 0001 
R09: 0a02
[Sun Apr  8 09:40:45 2018] R10: 0005 R11:  
R12: 96b915a5ec00
[Sun Apr  8 09:40:45 2018] R13:  R14: 96b4e187c600 
R15: 96b92c411400
[Sun Apr  8 09:40:45 2018] FS:  () 
GS:96b92cb0() knlGS:

[Sun Apr  8 09:40:45 2018] CS:  0010 DS:  ES:  CR0: 80050033
[Sun Apr  8 09:40:45 2018] CR2: 7f2c661a96f0 CR3: 000f48fcc000 
CR4: 003406e0

[Sun Apr  8 09:40:45 2018] Call Trace:
[Sun Apr  8 09:40:45 2018]  delayed_ref_async_start+0x8d/0xa0 [btrfs]
[Sun Apr  8 09:40:45 2018]  normal_work_helper+0x39/0x370 [btrfs]
[Sun Apr  8 09:40:45 2018]  process_one_work+0x1ce/0x410
[Sun Apr  8 09:40:45 2018]  worker_thread+0x2b/0x3d0
[Sun Apr  8 09:40:45 2018]  ? process_one_work+0x410/0x410
[Sun Apr  8 09:40:45 2018]  kthread+0x113/0x130
[Sun Apr  8 09:40:45 2018]  ? kthread_create_on_node+0x70/0x70
[Sun Apr  8 09:40:45 2018]  ret_from_fork+0x22/0x40
[Sun Apr  8 09:40:45 2018] Code: a0 82 d8 e0 eb 90 48 8b 53 60 f0 0f ba 
aa 50 12 00 00 02 72 1b 83 f8 fb 74 37 89 c6 48 c7 c7 c8 28 53 c0 89 04 
24 e8 c9 fa be e0 <0f> 0b 8b 04 24 89 c1 ba 04 0c 00 00 48 c7 c6 00 b9 
52 c0 48 89

[Sun Apr  8 09:40:45 2018] ---[ end trace 6ad220910a160dd3 ]---
[Sun Apr  8 09:40:45 2018] BTRFS: error (device sda2) in 
btrfs_run_delayed_refs:3076: errno=-17 Object already exists

[Sun Apr  8 09:40:45 2018] BTRFS info (device sda2): forced readonly
[Sun Apr  8 09:40:45 2018] BTRFS error (device sda2): pending csums is 
331776



[Tue Apr 10 05:23:22 2018] hid-generic 0003:1E71:170E.0009: 
hiddev0,hidraw0: USB HID v1.10 Device [NZXT.-Inc. NZXT USB Device] on 
usb-:01:00.0-9/input0

[Tue Apr 10 06:19:31 2018] [ cut here ]
[Tue Apr 10 06:19:31 2018] BTRFS: Transaction aborted (error -17)
[Tue Apr 10 06:19:31 2018] WARNING: CPU: 18 PID: 541 at 
fs/btrfs/extent-tree.c:3076 btrfs_run_delayed_refs+0x167/0x1b0 [btrfs]
[Tue Apr 10 06:19:31 2018] Modules linked in: cmac md4 nls_utf8 cifs ccm 
dns_resolver fscache lz4 lz4_compress zram rfcomm bnep xt_tcpudp 
iptable_filter hwmon_vid msr arc4 ext4 mbcache jbd2 fscrypto btusb 
edac_mce_amd btrtl btbcm