Re: Upgrade to 3.19.2 Kernel fails to boot

2015-03-31 Thread Anand Jain


Eric found something like this and has a fix within this email:
Subject: "I think "btrfs: fix leak of path in btrfs_find_item" broke stable
trees ..."


Anand

On 03/24/2015 06:40 PM, Rich Freeman wrote:

On Tue, Mar 24, 2015 at 2:31 AM, Anand Jain  wrote:

Do you have this fix?

  [PATCH] Btrfs: release path before starting transaction in can_nocow_extent

Could you try it?


I believe I already have this patch.  3.18.9 contains this:

commit bdeeab62a611f1f7cd48fd285ce568e8dcd0455a
Merge: 797afdf 1bda19e
Author: Linus Torvalds 
Date:   Fri Oct 18 16:46:21 2013 -0700

 Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

 Pull btrfs fix from Chris Mason:
  "Sage hit a deadlock with ceph on btrfs, and Josef tracked it down to a
   regression in our initial rc1 pull.  When doing nocow writes we were
   sometimes starting a transaction with locks held"

 * 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
   Btrfs: release path before starting transaction in can_nocow_extent
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




Re: [PATCH v2] Btrfs: add debugfs file to test transaction aborts

2015-03-31 Thread Anand Jain




+bool debugfs_abort_transaction(struct btrfs_fs_info *fs_info)
+{
+	if (!btrfs_debugfs_label_trans_abort[0])
+		return false;
+	return strcmp(fs_info->super_copy->label,
+		      btrfs_debugfs_label_trans_abort) == 0;
+}
+


 A label is not mandatory, so it may not be present.

 Am I missing something?

Thanks, Anand
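
For illustration, the matching rule in the quoted hunk can be modelled in
userspace C. This is only a sketch under stated assumptions, not the patch's
code: `label_filter`, `set_label_filter()` and `should_abort()` are
hypothetical stand-ins for `btrfs_debugfs_label_trans_abort`, a write to the
debugfs file, and `debugfs_abort_transaction()`, and the filesystem label is
passed in as a plain string. It shows the consequence raised above: because an
empty filter string disables the check, a filesystem that has no label can
never be selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical buffer size for this sketch. */
#define LABEL_SIZE 256

/* Hypothetical stand-in for btrfs_debugfs_label_trans_abort. */
static char label_filter[LABEL_SIZE];

/* Models writing a string to the debugfs file. */
void set_label_filter(const char *s)
{
	assert(strlen(s) < LABEL_SIZE);
	strcpy(label_filter, s);
}

/* Models debugfs_abort_transaction(): same two tests as the quoted hunk. */
bool should_abort(const char *fs_label)
{
	/* An empty filter means "abort nothing". */
	if (!label_filter[0])
		return false;
	return strcmp(fs_label, label_filter) == 0;
}
```

With the filter set to "btrfs_fstest_088", only a filesystem carrying exactly
that label matches. An unlabelled filesystem (empty label) does not match any
non-empty filter, and an empty filter matches nothing at all.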


Re: [PATCH] Btrfs: prevent deletion of mounted subvolumes

2015-03-31 Thread Eric W. Biederman
Omar Sandoval  writes:

> On Mon, Mar 30, 2015 at 02:30:34PM +0200, David Sterba wrote:
>> On Mon, Mar 30, 2015 at 02:02:17AM -0700, Omar Sandoval wrote:
>> > Before commit bafc9b754f75 ("vfs: More precise tests in d_invalidate"),
>> > d_invalidate() could return -EBUSY when a dentry for a directory had
>> > more than one reference to it. This is what prevented a mounted
>> > subvolume from being deleted, as struct vfsmount holds a reference to
>> > the subvolume dentry. However, that commit removed that case, and later
>> > commits in that patch series removed the return code from d_invalidate()
>> > completely, so we don't get that check for free anymore. So, reintroduce
>> > it in btrfs_ioctl_snap_destroy().
>> 
>> > This applies to 4.0-rc6. To be honest, I'm not sure that this is the most
>> > correct fix for this bug, but it's equivalent to the pre-3.18 behavior and
>> > it's the best that I could come up with. Thoughts?
>> 
>> > +  spin_lock(&dentry->d_lock);
>> > +  err = dentry->d_lockref.count > 1 ? -EBUSY : 0;
>> > +  spin_unlock(&dentry->d_lock);
>> 
>> The fix restores the original behaviour, but I don't think opencoding and
>> using internals is fine. Either there should be a vfs api for that or
>> there's an existing one that can be used instead.

I have a problem with restoring the original behavior as is.

In some sense it re-introduces the security issue that the d_invalidate
changes were built to fix.

Any user in the system can create a user namespace, create a mount
namespace and keep any subvolume pinned forever.  Which at the very
least could make a very nice DOS attack.  I am not familiar enough with
how people use subvolumes and 

So let me ask.  How can userspace not know that a subvolume that they
want to delete is already mounted?

I can see having something like is_local_mount_root and denying the
subvolume destruction if the mount that is pinning it is in your local
mount namespace.  


>> The bug here seems defined up to the point that we're trying to delete a
>> subvolume that's a mountpoint. My next guess is that a check
>> 
>>  if (d_mountpoint(&dentry)) { ... }
>> 
>> could work.
>
> That was my first instinct as well, but d_mountpoint() is true for
> dentries that have a filesystem mounted on them (e.g., after mount
> /dev/sda1 /mnt, the dentry for "/mnt"), not the dentry that is mounted.
>
> I poked around the mount code for awhile and couldn't come up with
> anything using the existing interface. Mounting subvolumes bubbles down
> to mount_subtree(), which doesn't really leave any traces of which
> subvolume is mounted except for the dentry in struct vfsmount.
>
> (As far as I can tell, under the covers subvolume deletion is more or
> less equivalent to an rm -rf, and we obviously don't do anything to stop
> users from doing that on the root of their mounted filesystem, but it
> appears that users expect the original behavior.)
>
> Here's an idea: mark mount root dentries as such in the VFS and check it
> in the Btrfs code. Adding fsdevel ML for comments
> (https://lkml.org/lkml/2015/3/30/125 is the original message).

Marking root dentries is needed to fix the bug that you can escape
the limitations of loopback mounts with a carefully placed rename.
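
The distinction drawn in the quoted text, between the dentry something is
mounted on and the dentry that is the root of a mount, can be shown with a toy
userspace model. Everything here is hypothetical and only mirrors the idea:
the `toy_` names and flag values are invented for the sketch and are not the
kernel's, and `toy_snap_destroy()` models only the proposed -EBUSY check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy flag values, invented for this sketch. */
#define TOY_MOUNTED	0x1u	/* something is mounted ON this dentry */
#define TOY_MOUNT_ROOT	0x2u	/* this dentry is the ROOT of a mount */

struct toy_dentry {
	const char *name;
	unsigned int flags;
};

/* Models "mount <root> <where>": both dentries get marked, differently. */
void toy_mount(struct toy_dentry *root, struct toy_dentry *where)
{
	assert(root != where);
	where->flags |= TOY_MOUNTED;
	root->flags |= TOY_MOUNT_ROOT;
}

/* What a d_mountpoint()-style test sees: true only for the mounted-on dentry. */
bool toy_d_mountpoint(const struct toy_dentry *d)
{
	return (d->flags & TOY_MOUNTED) != 0;
}

/* What the proposed marking would expose. */
bool toy_is_mount_root(const struct toy_dentry *d)
{
	return (d->flags & TOY_MOUNT_ROOT) != 0;
}

/* Models the proposed check in the subvolume deletion path. */
int toy_snap_destroy(const struct toy_dentry *subvol)
{
	return toy_is_mount_root(subvol) ? -EBUSY : 0;
}
```

After `toy_mount(&subvol, &mnt)`, the mountpoint test is true for `mnt` and
false for `subvol`, which is why that test alone cannot protect a mounted
subvolume; the mount-root test is the one that returns -EBUSY for it.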

I have a patch cooking that marks mountpoints and tracks all of the
mounts on a dentry.  So except for the possibility of stepping on each
other's toes I have no objections.

Eric

> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 74609b9..8a0933d 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -2384,6 +2384,11 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
>  		goto out_dput;
>  	}
>  
> +	if (d_is_mount_root(dentry)) {
> +		err = -EBUSY;
> +		goto out_dput;
> +	}
> +
>  	mutex_lock(&inode->i_mutex);
>  
>  	/*
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 82ef140..a28ca15 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -920,6 +920,10 @@ vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void
>  		return ERR_CAST(root);
>  	}
>  
> +	spin_lock(&root->d_lock);
> +	root->d_flags |= DCACHE_MOUNT_ROOT;
> +	spin_unlock(&root->d_lock);
> +
>  	mnt->mnt.mnt_root = root;
>  	mnt->mnt.mnt_sb = root->d_sb;
>  	mnt->mnt_mountpoint = mnt->mnt.mnt_root;
> @@ -1017,6 +1021,8 @@ static struct mount *clone_mnt(struct mount *old, struct dentry *root,
>  
>  static void cleanup_mnt(struct mount *mnt)
>  {
> +	struct dentry *root = mnt->mnt.mnt_root;
> +
>  	/*
>  	 * This probably indicates that somebody messed
>  	 * up a mnt_want/drop_write() pair.  If this
> @@ -1031,7 +1037,10 @@ static void cleanup_mnt(struct mount *mnt)
>  	if (unlikely(mnt->mnt_pins.first))
>  		mnt_pin_kill(mnt);
>  	fsnotify_vfsmount_delete(&mnt->mnt);
> -	dput(mnt->mnt.mnt_root);
> +	spin_lock(&root->d_lock);
> + root

Python pybtrfs df wrapper script to report btrfs metadata, block, space in df compatible output

2015-03-31 Thread Matt Grant
Hi!

I use this at work.  I'm releasing it to the list to prompt the design of
output that is 100% df-format compatible for automated reporting and
graphing.

This is so BTRFS space statistics can be reported back to the existing
graphing tool chains that many of you will have in place.  Quite useful
for munin, hint, hint :-)

URL for github is:

https://github.com/grantma/pybtrfs.git

There is also a shell script there that can be called from cron.

Please get back to me if you have any questions.

The standard no-warranty disclaimers apply to the code.  It's GPLv3
licensed.

Best Regards,

Matt Grant



Re: readdir tree/page lock inversion

2015-03-31 Thread Chris Mason

On Tue, Mar 31, 2015 at 7:45 PM, Zach Brown  wrote:

We've known for eons that it's not great that readdir holds tree locks
while calling mkwrite by way of dir_emit().  I fiddled around and found
a reliable if goofy deadlock reproducer.  It typically takes a few
seconds on modest hardware here (single package, dual core/ht, single
spindle.)

I made a quick attempt at refactoring readdir to stage entries in a
locally allocated page before releasing the locks and emitting them but
unfortunately my eyeballs fell out and one of the cats ran off with
them.  Maybe someone else will have more luck.


Ok, copying into a temporary page (or just a temp extent buffer) really 
is the best way.  Basically copy out the extent buffer right after the 
btrfs_search_slot and then process the local copy.
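
As a rough illustration of the staging idea above, here is a userspace C
sketch. It is not btrfs code: `dir_model`, `readdir_staged()` and
`emit_probe()` are hypothetical names, a boolean flag stands in for the tree
locks, and a fixed-size array stands in for the temporary page or extent
buffer copy. The only point is the ordering: copy the entries while the lock
is held, drop the lock, and only then call the emit callback, which is the
step that may fault on user memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define STAGE_MAX 16	/* entries that fit in the staging buffer */
#define NAME_LEN  32	/* bytes reserved per entry name */

/* Hypothetical directory model; "locked" stands in for the tree locks. */
struct dir_model {
	bool locked;
	size_t nr;
	char names[STAGE_MAX][NAME_LEN];
};

/* Plays the role of dir_emit(); a nonzero return stops the walk. */
typedef int (*emit_fn)(const char *name, void *ctx);

static void model_lock(struct dir_model *dir)
{
	assert(!dir->locked);
	dir->locked = true;
}

static void model_unlock(struct dir_model *dir)
{
	assert(dir->locked);
	dir->locked = false;
}

/* Returns the number of entries the callback accepted. */
size_t readdir_staged(struct dir_model *dir, emit_fn emit, void *ctx)
{
	char staged[STAGE_MAX][NAME_LEN];
	size_t nr, i;

	/* Step 1: copy everything we need while holding the lock. */
	model_lock(dir);
	nr = dir->nr;
	assert(nr <= STAGE_MAX);
	memcpy(staged, dir->names, nr * NAME_LEN);
	model_unlock(dir);

	/* Step 2: emit from the private copy with no lock held. */
	for (i = 0; i < nr; i++) {
		if (emit(staged[i], ctx))
			break;
	}
	return i;
}

/* Example callback context: counts calls made while the lock was free. */
struct emit_ctx {
	const struct dir_model *dir;
	size_t emitted;
	size_t unlocked_calls;
};

int emit_probe(const char *name, void *ctx)
{
	struct emit_ctx *c = ctx;

	(void)name;
	c->emitted++;
	if (!c->dir->locked)
		c->unlocked_calls++;
	return 0;
}
```

Calling `readdir_staged(&dir, emit_probe, &ctx)` on a two-entry directory
invokes the callback twice, each time with the model lock released. The cost
of this shape is the extra copy, and entries changed after the copy are not
seen until the next call.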


I'll take a look for the merge window.

-chris





Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Chris Murphy
On Tue, Mar 31, 2015 at 5:54 PM, Martin Langhoff
 wrote:
> On Tue, Mar 31, 2015 at 4:09 PM, Chris Murphy  wrote:
>> There should be a reference to an rdsosreport.txt in /run/... so find
>> a way to get that posted somewhere.
>
> I'll try, but it truly says nothing of interest from a block device /
> btrfs PoV. I have ample background debugging boot issues, disk
> corruption, etc from years of work w OLPC.

If there is no reference in this dracut shell to rdsosreport.txt, then use:
journalctl -b -l -o short-monotonic

You can mount anything at /sysroot including the boot partition if you
want, or a USB stick. The usual directories aren't available in the
initramfs before switchroot happens.


> That's a good idea! I was referring to something else -- I guess what
> I'm trying to say is: I'm not sure if this scrambled disk partition is
> a btrfs/kernel bug, or the cheap HDD lied about flushing a write to
> disk.

This is the realm of both esoteric knowledge and an active area of
research how to get reliable information about "what happened when the
power cut out." So you're not the only one not sure.


-- 
Chris Murphy


Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Martin Langhoff
On Tue, Mar 31, 2015 at 4:11 PM, Chris Murphy  wrote:
> While you're at it, try to mount the Btrfs volume in question normally
> and report kernel messages. If mount fails, try it with -o recovery
> mount option, and also report kernel messages and whether that fails.

Oh, I should have mentioned this -- in the context of the
initramfs/systemd diagnostic shell (which is single-user), it just
hangs. No messages.

I'll get a bootable usb going and try under that.

cheers,



m
-- 
 martin.langh...@gmail.com
 -  ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 ~ http://docs.moodle.org/en/User:Martin_Langhoff


Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Martin Langhoff
On Tue, Mar 31, 2015 at 4:09 PM, Chris Murphy  wrote:
> There should be a reference to an rdsosreport.txt in /run/... so find
> a way to get that posted somewhere.

I'll try, but it truly says nothing of interest from a block device /
btrfs PoV. I have ample background debugging boot issues, disk
corruption, etc from years of work w OLPC.

>>  - kernel is 3.1.9-200.fc21
>
> This is probably 3.18.9, which is the current F21 kernel.

Correct, thanks. I typo'd that.

>>A failure of the
>> HDD cannot be ruled out, low power conditions, cheap consumer part...
>
> Well you have to rule that out before anyone on this list can really
> help. Try booting Fedora 21 install media, and using smartctl -x on
> the drive.

That's a good idea! I was referring to something else -- I guess what
I'm trying to say is: I'm not sure if this scrambled disk partition is
a btrfs/kernel bug, or the cheap HDD lied about flushing a write to
disk.

cheers,


m
-- 
 martin.langh...@gmail.com
 -  ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 ~ http://docs.moodle.org/en/User:Martin_Langhoff


readdir tree/page lock inversion

2015-03-31 Thread Zach Brown
We've known for eons that it's not great that readdir holds tree locks
while calling mkwrite by way of dir_emit().  I fiddled around and found
a reliable if goofy deadlock reproducer.  It typically takes a few
seconds on modest hardware here (single package, dual core/ht, single
spindle.)

I made a quick attempt at refactoring readdir to stage entries in a
locally allocated page before releasing the locks and emitting them but
unfortunately my eyeballs fell out and one of the cats ran off with
them.  Maybe someone else will have more luck.

- z

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/syscall.h>

#define LEN 300
static char junk[LEN];

int main(int argc, char **argv)
{
	int dir_fd;
	void *ptr;
	pid_t pid;
	int fd;
	int ret;

	/* create our own dir and file for convenience */
	fd = open("./file", O_RDWR | O_CREAT, 0644);
	if (fd < 0) {
		perror("open file");
		exit(1);
	}

	if (mkdir("./dir", 0755)) {
		perror("mkdir");
		exit(1);
	}

	dir_fd = open("./dir", O_RDONLY);
	if (dir_fd < 0) {
		perror("open dir");
		exit(1);
	}

	pid = fork();
	if (pid < 0) {
		perror("fork");
		exit(1);
	}

	if (!pid) {
		/* first child spins modifying entries in the dir */
		if (chdir("./dir")) {
			perror("chdir");
			exit(1);
		}

		for (;;) {
			if (link("../file", "./link")) {
				perror("link");
				exit(1);
			}
			if (unlink("./link")) {
				perror("unlink");
				exit(1);
			}
		}
	}

	pid = fork();
	if (pid < 0) {
		perror("fork");
		exit(1);
	}

	if (!pid) {
		/* second child spins reading entries into mapped file */
		ret = write(fd, junk, LEN);
		if (ret != LEN) {
			perror("write");
			exit(1);
		}

		ptr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED,
			   fd, 0);
		if (ptr == MAP_FAILED) {
			perror("mmap");
			exit(1);
		}

		for (;;) {
			/* drop the page so the next getdents faults it in */
			if (posix_fadvise(fd, 0, 4096, POSIX_FADV_DONTNEED)) {
				perror("fadvise");
				exit(1);
			}

			if (lseek(dir_fd, 0, SEEK_SET)) {
				perror("lseek");
				exit(1);
			}

			ret = syscall(SYS_getdents, dir_fd, ptr, LEN);
			if (ret <= 0) {
				perror("getdents");
				exit(1);
			}
		}
	}

	/* and the parent spins doing a system-wide sync */
	while (1)
		sync();

	return 0;
}


Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Kai Krakow
Chris Murphy wrote:

> On Tue, Mar 31, 2015 at 4:15 PM, Chris Murphy 
> wrote:
> 
>> The i915 regression right now is really annoying. With a Samsung 840
>> EVO I've  had inexplicable and non-deterministic boot failures.
> 
> Clarification: the boot failures happen following the i915 panic and
> subsequent forced power off.

Yeah I thought that, too, because after hitting "reset" it looked like one 
hard disk didn't appear in dmesg and thus btrfs didn't mount (btrfs-raid). 
So I turned the machine off completely because I had similar issues with 
i915 freezes and strange boot issues during the following boot before. It 
looks like the GPU is not necessarily completely reset when hitting the 
reset button. But that's another story.

In my case the hard disk was there - I just didn't scan hard enough through
the huge pile of logs. I had to btrfs-zero-log, typed "reboot" into the
rescue shell, the kernel came back, mount was still locking up and sitting
there until systemd decided to throw me to the emergency shell after 5
minutes of waiting or so. I rebooted again, and the machine came up. This
was a few reboots after the machine was powered off, so I'd rule out any GPU
freeze artifacts here. I just needed multiple reboots to come to terms with
my dracut/systemd combo super hero voodoo abilities (read: I cumbersomely
tried everything until one thing worked while swearing at my innocent
monitor, well sort of, it's powered by the GPU).

On every reboot it felt like bcache was replaying cache transactions - but I 
think this is by design (read: bcache is always dirty, even after a clean 
shutdown, if using write-back mode) and not part of the problem.

-- 
Replies to list only preferred.



[PATCH v2] Btrfs: add debugfs file to test transaction aborts

2015-03-31 Thread Filipe Manana
At the moment we can not reliably and deterministically test that the
transaction abortion code works as expected. For example in the past [1]
we had an issue where that code returned the pinned extents to the free
space caches allowing fstrim to perform a discard against the physical
locations of the extents, and worse, when the fs was mounted with the
option -o discard it explicitly did a discard on each pinned extent.
This resulted in filesystem corruption, leaving the fs unmountable.

This patch adds a debugfs file named abort_transaction, which has a
default value of an empty string, can only be written by someone with root
privileges and when a string is written to it, it makes sure all subsequent
transaction commits fail at the very end (right before writing the new
superblock) if that string matches the label of the filesystem.
This way we can for example write a deterministic fstest for commit [1]
which looks like:

  _require_btrfs_debugfs()
  {
      if [ -d /sys/kernel/debug/btrfs ]; then
          BTRFS_DEBUG_FS=/sys/kernel/debug/btrfs
      elif [ -d /debug/btrfs ]; then
          BTRFS_DEBUG_FS=/debug/btrfs
      else
          _notrun "btrfs debugfs not available"
      fi

      if [ -n "$1" ]; then
          if [ ! -e "$BTRFS_DEBUG_FS/$1" ]; then
              _notrun "btrfs debugfs path $1 not available"
          fi
      fi
  }

  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_btrfs_debugfs "abort_transaction"
  _need_to_be_root

  rm -f $seqres.full

  # We will abort a btrfs transaction later, which always produces a warning in
  # dmesg. We do not want the test to fail because of this.
  _disable_dmesg_check
  fslabel="btrfs_fstest_$seq"
  _scratch_mkfs -L $fslabel >>$seqres.full 2>&1
  _scratch_mount "-o discard"
  _require_batched_discard $SCRATCH_MNT

  # Create a file and commit the current transaction.
  echo -n "hello" > $SCRATCH_MNT/foo
  sync

  # Now update the file, which forces a COW operation of the fs root, adding
  # the old root location to the pinned extents list.
  echo -n " world" >> $SCRATCH_MNT/foo

  # Now abort the current transaction, unmount the fs, mount it again and verify
  # we can open the file and read its content (which should match what it had
  # when the last transaction committed successfully). Btrfs used to issue a
  # discard operation on the extents in the pinned extents list, resulting in
  # corruption of metadata and data, and also used to return the pinned extents
  # to the free space caches, allowing future fstrim operations to perform a
  # discard operation against the pinned extents.
  echo -n "$fslabel" > $BTRFS_DEBUG_FS/abort_transaction
  sync
  echo > $BTRFS_DEBUG_FS/abort_transaction
  $FSTRIM_PROG $SCRATCH_MNT

  _scratch_unmount
  _scratch_mount
  echo "File content after transaction abort + remount: $(cat $SCRATCH_MNT/foo)"

The test's expected output is:

  File content after transaction abort + remount: hello

With patch [1] reverted the test fails with:

  btrfs/088 2s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad)
  --- tests/btrfs/088.out   2015-03-31 19:31:17.558436298 +0100
  +++ /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad   2015-03-31 19:58:12.741403640 +0100
  @@ -1,2 +1,8 @@
   QA output created by 088
  -File content after transaction abort + remount: hello
  +mount: wrong fs type, bad option, bad superblock on /dev/sdc,
  +   missing codepage or helper program, or other error
  +   In some cases useful info is found in syslog - try
  +   dmesg | tail  or so
  +
  ...
  (Run 'diff -u tests/btrfs/088.out /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad'  to see the entire diff)
  _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent (see /home/fdmanana/git/hub/xfstests/results//btrfs/088.full)

  $ cat /home/fdmanana/git/hub/xfstests/results//btrfs/088.full
  (...)
  _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
  *** fsck.btrfs output ***
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  read block failed check_tree_block
  Couldn't read tree root
  Couldn't open file system
  *** end fsck.btrfs output

With this feature we can also get a fstest for the issue fixed by the patch
that fixes log tree corruption when the fs is mounted with -o discard [2].

  "Btrfs: fix log tree corruption when fs mounted with -o discard"

[1] commit 678886bdc637 ("Btrfs: fix fs corruption on transaction abort
  if device supports discard")
[2] "Btrfs: fix log tree corruption when fs mounted with -o discard"

Signed-off-by: Filipe Manana 
---

V2: Allow this to select by label which filesystem will have its transaction
aborted. The previou

Re: Upgrade to 3.19.2 Kernel fails to boot

2015-03-31 Thread Chris Murphy
On Tue, Mar 31, 2015 at 3:24 PM, Benjamin Hodgetts  wrote:
> G. Richard Bellamy  pteradigm.com> writes:
>
>>
>> When I upgrade to the 3.19.2 Kernel I get a deadlocked boot:
>> INFO: task mount:302 blocked for more than 120 seconds.
>> INFO: task btrfs-transacti:329 blocked for more than 120 seconds.
>>
>> I have an LTS Kernel at 3.14.35 that also fails to boot with the same
>> behavior.
>>
>> My 3.18.6 works just fine.

Well there are only two backports in 3.14.35. Most of the ones found
in 3.19.2 are in 3.14.36.


commit f9e2ba638c32dff17ee6404e2c8245fd49d99b8b
Author: David Sterba 
Date:   Fri Jan 2 18:45:16 2015 +0100

btrfs: fix leak of path in btrfs_find_item

commit 381cf6587f8a8a8e981bc0c18859b51dc756 upstream.

If btrfs_find_item is called with NULL path it allocates one locally but
does not free it. Affected paths are inserting an orphan item for a file
and for a subvol root.

Move the path allocation to the callers.

Fixes: 3f870c289900 ("btrfs: expand btrfs_find_item() to include
find_orphan_item functionality")
Signed-off-by: David Sterba 
Signed-off-by: Greg Kroah-Hartman 

commit 74e42361fa3bc102647ad1e1ec7c21b747658843
Author: David Sterba 
Date:   Fri Dec 19 18:38:47 2014 +0100

btrfs: set proper message level for skinny metadata

commit 5efa0490cc94aee06cd8d282683e22a8ce0a0026 upstream.

This has been confusing people for too long, the message is really just
informative.


-- 
Chris Murphy


Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Chris Murphy
On Tue, Mar 31, 2015 at 4:15 PM, Chris Murphy  wrote:

> The i915 regression right now is really annoying. With a Samsung 840
> EVO I've  had inexplicable and non-deterministic boot failures.

Clarification: the boot failures happen following the i915 panic and
subsequent forced power off.


-- 
Chris Murphy


Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Chris Murphy
On Tue, Mar 31, 2015 at 3:45 PM, Kai Krakow  wrote:

> I had this happen, too, lately. It's quite often happening after an unclean
> shutdown (which currently quite often happened to me due to the xorg intel
> driver having GPU freezes). SysRq+W shows that the mount process is locked
> somewhere in the btrfs code path and won't quit if Ctrl+C'd...
>
> Only way to fix it was to btrfs-zero-log.

The i915 regression right now is really annoying. With a Samsung 840
EVO I've  had inexplicable and non-deterministic boot failures. When
running btrfs check from the initramfs  (booting with
rd.break=pre-mount) I get a very long pile of complaints... minutes of
scrolling text of horrible sounding problems. Yet the same btrfs-progs
and the same kernel from Fedora 22 install media, zero complaints, and
mounts fine.

So I have no idea what's going on right now. It even corrupts the EFI
System partition, these crashes.

-- 
Chris Murphy


Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Chris Murphy
On Tue, Mar 31, 2015 at 2:39 PM, Matt Grant
 wrote:
> Seen this before at home.
>
> You have to mount -o recovery off a 3.19 kernel to fix it...
>
> If you can get the SSD out, attach it to a desktop, as there will be no 
> Install CDs using 3.19 yet.

Fedora 22 Workstation alpha has 4.0.0 (rc1 I think), and the current
TC6 beta has 4.0.0-rc4. It's possible to use the netinstall, which is
much smaller, and use boot param single or rescue to avoid the
installer launching.

And actually 3.19.2 is the stable kernel for Fedora 21, with 3.19.3
just pushed today (mirrors take a day or two to catch up), not 3.18.9
as I reported earlier.

-- 
Chris Murphy


Re: Upgrade to 3.19.2 Kernel fails to boot

2015-03-31 Thread Benjamin Hodgetts
G. Richard Bellamy  pteradigm.com> writes:

> 
> When I upgrade to the 3.19.2 Kernel I get a deadlocked boot:
> INFO: task mount:302 blocked for more than 120 seconds.
> INFO: task btrfs-transacti:329 blocked for more than 120 seconds.
> 
> I have an LTS Kernel at 3.14.35 that also fails to boot with the same
> behavior.
> 
> My 3.18.6 works just fine.
> 
> -rb

Same issue here. Rebooted after updating to 3.19.2 and now I get the 
"mount blocked" message and can't boot.



Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Kai Krakow
Kai Krakow wrote:

> Chris Murphy wrote:
> 
>> On Tue, Mar 31, 2015 at 2:09 PM, Chris Murphy 
>> wrote:
>> 
>>> Well you have to rule that out before anyone on this list can really
>>> help. Try booting Fedora 21 install media, and using smartctl -x on
>>> the drive.
>> 
>> While you're at it, try to mount the Btrfs volume in question normally
>> and report kernel messages. If mount fails, try it with -o recovery
>> mount option, and also report kernel messages and whether that fails.
> 
> I had this happen, too, lately. It's quite often happening after an
> unclean shutdown (which currently quite often happened to me due to the
> xorg intel driver having GPU freezes). SysRq+W shows that the mount
> process is locked somewhere in the btrfs code path and won't quit if
> Ctrl+C'd...
> 
> Only way to fix it was to btrfs-zero-log. But it still took some reboots
> from initramfs until it successfully mounted again (I could mount it in
> initramfs right after zero-log but upon reboot it hung again though at a
> different stage probably).
> 
> So I guess there's some race on the one hand (happens from time to time,
> unrelated to fixing it with zero-log), and a deadlock on the other hand
> after some unclean shutdowns (more or less random).
> 
> My setup is 3-device btrfs-mraid1-draid0 on bcache. Bcache wasn't involved
> in the backtrace of SysRq+W, however. Apparently I don't have a screenshot
> of it because my smart phone is currently fried...

BTW: I tried all kernels from current 3.19.x back to 3.18.0 which still live 
on my boot partition - each with the same result and very similar backtrace 
(SysRq+W)...

-- 
Replies to list only preferred.



Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Kai Krakow
Chris Murphy wrote:

> On Tue, Mar 31, 2015 at 2:09 PM, Chris Murphy 
> wrote:
> 
>> Well you have to rule that out before anyone on this list can really
>> help. Try booting Fedora 21 install media, and using smartctl -x on
>> the drive.
> 
> While you're at it, try to mount the Btrfs volume in question normally
> and report kernel messages. If mount fails, try it with -o recovery
> mount option, and also report kernel messages and whether that fails.

I had this happen, too, lately. It's quite often happening after an unclean 
shutdown (which currently quite often happened to me due to the xorg intel
driver having GPU freezes). SysRq+W shows that the mount process is locked 
somewhere in the btrfs code path and won't quit if Ctrl+C'd...

Only way to fix it was to btrfs-zero-log. But it still took some reboots 
from initramfs until it successfully mounted again (I could mount it in 
initramfs right after zero-log but upon reboot it hung again though at a 
different stage probably).

So I guess there's some race on the one hand (happens from time to time,
unrelated to fixing it with zero-log), and a deadlock on the other hand after
some unclean shutdowns (more or less random).

My setup is 3-device btrfs-mraid1-draid0 on bcache. Bcache wasn't involved 
in the backtrace of SysRq+W, however. Apparently I don't have a screenshot 
of it because my smart phone is currently fried...

-- 
Replies to list only preferred.



Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Chris Murphy
On Tue, Mar 31, 2015 at 2:09 PM, Chris Murphy  wrote:

> Well you have to rule that out before anyone on this list can really
> help. Try booting Fedora 21 install media, and using smartctl -x on
> the drive.

While you're at it, try to mount the Btrfs volume in question normally
and report kernel messages. If mount fails, try it with -o recovery
mount option, and also report kernel messages and whether that fails.

-- 
Chris Murphy


Re: F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Chris Murphy
On Tue, Mar 31, 2015 at 12:55 PM, Martin Langhoff
 wrote:
> Hi BTRFS folks,
>
> one of my dev boxes is a Thinkpad x220, with a single hybrid
> (HDD+Flash) disk, running F21 with BTRFS partitions for /home and / .
>
> After losing power (ran out of battery, possibly while trying to
> hibernate) -- the system will not boot. The initrd breaks out to a
> shell where I find that the partition holding / is failing to mount.

There should be a reference to an rdsosreport.txt in /run/... so find
a way to get that posted somewhere.

>  - kernel is 3.1.9-200.fc21

This is probably 3.18.9, which is the current F21 kernel.

>A failure of the
> HDD cannot be ruled out, low power conditions, cheap consumer part...

Well you have to rule that out before anyone on this list can really
help. Try booting Fedora 21 install media, and using smartctl -x on
the drive.


-- 
Chris Murphy


[PATCH] Btrfs: add debugfs file to test transaction aborts

2015-03-31 Thread Filipe Manana
At the moment we can not reliably and deterministically test that the
transaction abortion code works as expected. For example in the past [1]
we had an issue where that code returned the pinned extents to the free
space caches allowing fstrim to perform a discard against the physical
locations of the extents, and worse, when the fs was mounted with the
option -o discard it explicitly did a discard on each pinned extent.
This resulted in filesystem corruption, leaving the fs unmountable.

This patch adds a debugfs file named abort_transaction, which has a
default value of 0, can only be written by someone with root privileges
and when a non-zero value is written to it, it makes all subsequent
transaction commits fail at the very end (right before writing the new
superblock), which results in a transaction abortion.
This way we can for example write a deterministic fstest for commit [1]
which looks like:

  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_btrfs_debugfs "abort_transaction"
  _need_to_be_root

  rm -f $seqres.full

  # We will abort a btrfs transaction later, which always produces a warning in
  # dmesg. We do not want the test to fail because of this.
  _disable_dmesg_check

  _scratch_mkfs >>$seqres.full 2>&1
  _scratch_mount "-o discard"
  _require_batched_discard $SCRATCH_MNT

  # Create a file and commit the current transaction.
  echo -n "hello" > $SCRATCH_MNT/foo
  sync

  # Now update the file, which forces a COW operation of the fs root, adding
  # the old root location to the pinned extents list.
  echo -n " world" >> $SCRATCH_MNT/foo

  # Now abort the current transaction, unmount the fs, mount it again and verify
  # we can open the file and read its content (which should match what it had
  # when the last transaction committed successfully). Btrfs used to issue a
  # discard operation on the extents in the pinned extents list, resulting in
  # corruption of metadata and data, and also used to return the pinned extents
  # to the free space caches, allowing future fstrim operations to perform a
  # discard operation against the pinned extents.
  echo 1 > $BTRFS_DEBUG_FS/abort_transaction
  sync
  echo 0 > $BTRFS_DEBUG_FS/abort_transaction
  $FSTRIM_PROG $SCRATCH_MNT

  _scratch_unmount
  _scratch_mount
  echo "File content after transaction abort + remount: $(cat $SCRATCH_MNT/foo)"

The test's expected output is:

  File content after transaction abort + remount: hello

With patch [1] reverted the test fails with:

  btrfs/088 2s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad)
  --- tests/btrfs/088.out   2015-03-31 19:31:17.558436298 +0100
  +++ /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad   2015-03-31 19:58:12.741403640 +0100
  @@ -1,2 +1,8 @@
   QA output created by 088
  -File content after transaction abort + remount: hello
  +mount: wrong fs type, bad option, bad superblock on /dev/sdc,
  +   missing codepage or helper program, or other error
  +   In some cases useful info is found in syslog - try
  +   dmesg | tail  or so
  +
  ...
  (Run 'diff -u tests/btrfs/088.out /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad'  to see the entire diff)
  _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent (see /home/fdmanana/git/hub/xfstests/results//btrfs/088.full)

  $ cat /home/fdmanana/git/hub/xfstests/results//btrfs/088.full
  (...)
  _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
  *** fsck.btrfs output ***
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  Check tree block failed, want=29573120, have=0
  read block failed check_tree_block
  Couldn't read tree root
  Couldn't open file system
  *** end fsck.btrfs output

[1] commit 678886bdc637 ("Btrfs: fix fs corruption on transaction abort
  if device supports discard")

Signed-off-by: Filipe Manana 
---

If there are no objections to the feature/interface, I will submit a test
for fstests.

 fs/btrfs/sysfs.c   | 11 +++
 fs/btrfs/sysfs.h   |  2 ++
 fs/btrfs/transaction.c |  9 +
 3 files changed, 22 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 94edb0a..4b10f3b 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -666,6 +666,7 @@ static struct dentry *btrfs_debugfs_root_dentry;
 
 /* Debugging tunables and exported data */
 u64 btrfs_debugfs_test;
+static u32 btrfs_debugfs_abort_transaction;
 
 int btrfs_sysfs_add_one(struct btrfs_fs_info *fs_info)
 {
@@ -719,10 +720,20 @@ static int btrfs_init_debugfs(void)
 
debugfs_create_u64("test", S_IRUGO | S_IWUGO, btrfs_debugfs_root_dentry,
&btrfs_debugfs_test);
+
+   debugfs_create_bool("abort_transaction",
+   S_IRUSR | S_IWUSR | S_

Re: Linux 4.0.0-rc5 - [ 1851.858652] BTRFS: error (device sda1) in cleanup_transaction:1686: errno=-17 Object already exists

2015-03-31 Thread Torbjørn

On 03/31/2015 08:12 PM, Chris Mason wrote:



On Mon, Mar 30, 2015 at 1:42 PM, Torbjørn  wrote:

Hi,

Just a follow up on this report.

The file system in question is a raid1 across 2x320G old Western 
Digital WD3200KS.
I yanked them out of the server to run a fsck on another computer
(after a proper shutdown).


One of the disks did not get properly detected on the secondary 
computer.

Hopefully the fsck of the single disk is still of some value to you.

As you can see, there are several issues with the fs.
The system has occasionally had hard reboots.

The fs does not have any real value for me. Everything worth anything 
is backed up.

I'll keep the drive around in case it's of any value for some devs.

As noted before: this (corrupted) fs only gets errors when booting
into 4.0-rc5. With 4.0-rc4 or earlier it works as if nothing is wrong.


This is really strange because we also have reports from v3.19 stable 
kernels, but none of the btrfs patches between rc4 and rc5 were tagged 
for stable.


Can I convince you to hammer a bit more on rc4?  I'd like to make sure 
it really was a regression introduced in rc5.


-chris




Perhaps I was a bit unclear. The error is triggered when booting into 
rc5. If I reset and try to start up rc4 or earlier, it is still there. 
After zero-log I can boot into rc4.


I already did a fresh reinstall to alternative drives. I did not want to 
have the corrupted fs as root.


I'm not sure I can get the second disk in the raid1 to work. It was head 
crashing when trying to attach it for fsck. If I somehow get it back 
online I can do some more testing. Anything in particular?


--
Torbjørn


F21 fails to mount root part, btrfs check: Couldn't open file system

2015-03-31 Thread Martin Langhoff
Hi BTRFS folks,

one of my dev boxes is a Thinkpad x220, with a single hybrid
(HDD+Flash) disk, running F21 with BTRFS partitions for /home and / .

After losing power (ran out of battery, possibly while trying to
hibernate) -- the system will not boot. The initrd breaks out to a
shell where I find that the partition holding / is failing to mount.

 - journalctl/dmesg doesn't show anything of interest, though it says
there is no hibernate image to restore, and that it cannot mount
sysroot
 - btrfs rescue super-recover says there's no problem, all superblocks ok
 - btrfs rescue chunk-recover -vy takes a while, and says there's
nothing to recover
 - btrfs check (tried all options) -- says Couldn't open file system
 - kernel is 3.1.9-200.fc21
 - btrfs tools 3.18.1
 - disk is a Seagate 500GB Thin SSHD SATA 6Gb/s 64MB Cache 2.5-Inch
(ST500LM000) -- cost ~$100 about 2y ago

Is there anything I can do here? Anything to recover the partition?
Anything to diagnose things in a way that is useful to btrfs
development, and understanding of dataloss crashes? A failure of the
HDD cannot be ruled out, low power conditions, cheap consumer part...

There is no valuable data here, just have to reinstall F21.

cheers,


martin [ I am not on the list, CC appreciated... ]
-- 
 martin.langh...@gmail.com
 -  ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 ~ http://docs.moodle.org/en/User:Martin_Langhoff


Re: Linux 4.0.0-rc5 - [ 1851.858652] BTRFS: error (device sda1) in cleanup_transaction:1686: errno=-17 Object already exists

2015-03-31 Thread Chris Mason



On Mon, Mar 30, 2015 at 1:42 PM, Torbjørn  wrote:

Hi,

Just a follow up on this report.

The file system in question is a raid1 across 2x320G old Western 
Digital WD3200KS.
I yanked them out of the server to run a fsck on another computer
(after a proper shutdown).


One of the disks did not get properly detected on the secondary 
computer.

Hopefully the fsck of the single disk is still of some value to you.

As you can see, there are several issues with the fs.
The system has occasionally had hard reboots.

The fs does not have any real value for me. Everything worth anything 
is backed up.

I'll keep the drive around in case it's of any value for some devs.

As noted before: this (corrupted) fs only gets errors when booting
into 4.0-rc5. With 4.0-rc4 or earlier it works as if nothing is wrong.


This is really strange because we also have reports from v3.19 stable 
kernels, but none of the btrfs patches between rc4 and rc5 were tagged 
for stable.


Can I convince you to hammer a bit more on rc4?  I'd like to make sure 
it really was a regression introduced in rc5.


-chris





Re: ERROR: error removing the device '/dev/sdXN' - Inappropriate ioctl for device

2015-03-31 Thread Martin
On 31/03/15 18:37, David Sterba wrote:
> On Mon, Mar 30, 2015 at 09:59:05PM +0100, Martin wrote:
>> btrfs device delete /dev/sdf5 /mnt/data2
>>
>> ERROR: error removing the device '/dev/sdf5' - Inappropriate ioctl for
>> device
> 
> Strange, that would mean that the ioctl number is not handled at all. I
> haven't found any change in device delete on the userspace side. This
> would mean that /mnt/data2 is not mounted. Did you try to do lazy
> unmount?

Nope, that filesystem is still mounted and in use, but still a very good
deduction...

Trying smartctl just now on that drive (/dev/sdf) gives errors
suggesting that the drive is no longer responding at all :-(

There is also a steady repeat of:

kernel: sd 7:0:0:0: [sdf] CDB:
kernel: cdb[0]=0x2a: 2a 00 04 02 45 80 00 00 08 00
kernel: sd 7:0:0:0: [sdf] Unhandled error code
kernel: sd 7:0:0:0: [sdf]
kernel: Result: hostbyte=0x04 driverbyte=0x00

(Must watch that lot doesn't fill the log.)


Should I try a:

btrfs device delete missing /mnt/data2

or do I need to unmount or remount first?


... And the system is live and not to be rebooted if at all possible
until the weekend...

Thanks,
Martin



Re: ERROR: error removing the device '/dev/sdXN' - Inappropriate ioctl for device

2015-03-31 Thread David Sterba
On Mon, Mar 30, 2015 at 09:59:05PM +0100, Martin wrote:
> btrfs device delete /dev/sdf5 /mnt/data2
> 
> ERROR: error removing the device '/dev/sdf5' - Inappropriate ioctl for
> device

Strange, that would mean that the ioctl number is not handled at all. I
haven't found any change in device delete on the userspace side. This
would mean that /mnt/data2 is not mounted. Did you try to do lazy
unmount?


Re: [PATCH v2] fstests: test for btrfs cloning of zero length ranges

2015-03-31 Thread David Sterba
On Tue, Mar 31, 2015 at 12:04:58AM +0100, Filipe Manana wrote:
> Test cloning a file range with a length of zero into a destination offset
> greater than zero.
> 
> This made btrfs create an extent state record with a start offset greater than
> the end offset, resulting in chaos such as an infinite loop when evicting an
> inode.
> 
> This issue was fixed by the following linux kernel patch:
> 
>Btrfs: fix inode eviction infinite loop after cloning into it
> 
> Signed-off-by: Filipe Manana 

Reviewed-by: David Sterba 

> V2: Use an offset of 64Kb so that the test can run on platforms with any
> page size. In btrfs the fs block size must be a multiple of the page
> size, so a 4Kb offset would make the test fail on machines with a
> page size > 4Kb because the clone ioctl only accepts offsets and lengths
> that are multiples of the block size.

Nice.
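
The block-size constraint described in the V2 note can be sketched with a
small alignment check. This is only an illustration: is_block_aligned is a
hypothetical helper, not part of fstests, and the block sizes listed are
assumed examples of what different platforms might use.

```shell
# Hypothetical helper (not part of fstests): succeeds when VALUE is a
# multiple of BLOCK_SIZE, which is what the clone ioctl requires of
# offsets and lengths according to the note above.
is_block_aligned()
{
	[ $(( $1 % $2 )) -eq 0 ]
}

# A 64Kb offset is checked against a few assumed block sizes.
offset=$((64 * 1024))
for bs in 4096 16384 65536; do
	if is_block_aligned "$offset" "$bs"; then
		echo "offset $offset is aligned to block size $bs"
	else
		echo "offset $offset is NOT aligned to block size $bs"
	fi
done
```

By contrast, a 4Kb offset would not pass the same check against a 64Kb
block size, which is the failure the V2 change avoids.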


Re: snapshot destruction making IO extremely slow

2015-03-31 Thread David Sterba
On Mon, Mar 30, 2015 at 05:09:52PM +0200, Marc Cousin wrote:
> > So it would be good to sample the active threads and see where it's
> > spending the time. It could be somewhere in the rb-tree representing
> > extents, but that's a guess.
> > 
> I just need to be told how to do that :)
> 
> Something like a perf top ?

for i in `pgrep btrfs`; do
ps $i
sudo cat /proc/$i/stack 2>/dev/null
done

and capture samples every second.
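
One way to capture such samples repeatedly is sketched below. It wraps the
loop above in a function; sample_stacks and its arguments are hypothetical
names chosen for illustration, and the function assumes it is run as root
(reading /proc/PID/stack normally requires that) on a system with pgrep.

```shell
# Hypothetical helper: print the kernel stacks of all processes whose
# name matches PATTERN, COUNT times, sleeping INTERVAL seconds in between.
# Usage: sample_stacks PATTERN COUNT INTERVAL
sample_stacks()
{
	local pattern="$1"
	local count="$2"
	local interval="$3"
	local n=1
	local pid

	while [ "$n" -le "$count" ]; do
		echo "=== sample $n at $(date +%T) ==="
		for pid in $(pgrep "$pattern" 2>/dev/null); do
			ps -o pid=,comm= -p "$pid" 2>/dev/null
			# Errors are suppressed: the process may have exited,
			# or the caller may lack permission to read the stack.
			cat "/proc/$pid/stack" 2>/dev/null
		done
		sleep "$interval"
		n=$((n + 1))
	done
}
```

For example, "sample_stacks btrfs 60 1 > stacks.txt" would collect one
sample per second for about a minute.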


Re: the wiki?

2015-03-31 Thread Dave Stevens

Quoting André-Sebastian Liebe :


Hi Dave,

you should forward your question directly to the kernel.org webmaster.
They are using mediawiki in a multisite environment for all kernel.org
related wikis.

regards,
André


Thanks!

Dave



On 03/31/2015 04:00 AM, Dave Stevens wrote:

I've been reading the wiki to help sort out an issue and learn more
about RAID. I wanted to download it to my local drive to read offline
and that facility seems to not be there. I checked the mediawiki
version in the source and it is an LTS version for which support ends
in two months. Are there plans to migrate the content to a newer version?

Dave









--
"As long as politics is the shadow cast on society by big business,
the attenuation of the shadow will not change the substance."

-- John Dewey






[PATCH] fstests: regression test for btrfs file range cloning

2015-03-31 Thread Filipe Manana
Test btrfs file range cloning with the same file as a source and
destination.

This tests a specific scenario where the extent layout of the file
confused the clone ioctl implementation making it return -EEXIST to
userspace. This issue was fixed by the following linux kernel patch:

   Btrfs: fix range cloning when same inode used as source and destination

Signed-off-by: Filipe Manana 
---
 tests/btrfs/087 | 102 
 tests/btrfs/087.out |  35 ++
 tests/btrfs/group   |   1 +
 3 files changed, 138 insertions(+)
 create mode 100755 tests/btrfs/087
 create mode 100644 tests/btrfs/087.out

diff --git a/tests/btrfs/087 b/tests/btrfs/087
new file mode 100755
index 000..0a5deb7
--- /dev/null
+++ b/tests/btrfs/087
@@ -0,0 +1,102 @@
+#! /bin/bash
+# FS QA Test No. btrfs/087
+#
+# Test btrfs file range cloning with the same file as a source and destination.
+#
+# This tests a specific scenario where the extent layout of the file confused
+# the clone ioctl implementation making it return -EEXIST to userspace.
+# This issue was fixed by the following linux kernel patch:
+#
+#Btrfs: fix range cloning when same inode used as source and destination
+#
+#---
+# Copyright (C) 2015 SUSE Linux Products GmbH. All Rights Reserved.
+# Author: Filipe Manana 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_cloner
+_need_to_be_root
+
+rm -f $seqres.full
+
+# Create a file with an extent layout that confused the btrfs clone ioctl
+# implementation. The first extent item that is cloned by the second call
+# to the cloner program will have only a trailing part of it referenced by
+# a new extent item, since the source offset starts in the middle of that
+# extent. This confused the clone ioctl because after inserting this new
+# extent item it would immediately after process it again thinking it
+# corresponded to an extent that existed before - this made it attempt to
+# insert a duplicated extent item pointing to the same extent again, which
+# made it return an -EEXIST error to userspace and turn the filesystem to
+# readonly mode (since the current transaction got aborted).
+test_clone()
+{
+   local bs=$1
+
+   $XFS_IO_PROG -f -c "pwrite -S 0xaa $(($bs * 2)) $(($bs * 2))" \
+   $SCRATCH_MNT/foo | _filter_xfs_io
+
+   $CLONER_PROG -s $((3 * $bs)) -d $((267 * $bs)) -l 0 $SCRATCH_MNT/foo \
+   $SCRATCH_MNT/foo
+   $CLONER_PROG -s $((217 * $bs)) -d $((95 * $bs)) -l 0 $SCRATCH_MNT/foo \
+   $SCRATCH_MNT/foo
+
+   echo "File content after clone operations:"
+   od -t x1 $SCRATCH_MNT/foo
+}
+
+# Make sure the test passes offsets and lengths to the btrfs clone ioctl that
+# are multiples of the fs block size. Currently the block size on btrfs must
+# be a multiple of the page size, so use a 64Kb fs block size in order to be
+# able to test on every platform supported by linux.
+bs=$((64 * 1024))
+
+echo "Testing without the no-holes feature"
+_scratch_mkfs "-O ^no-holes -l $bs" >>$seqres.full 2>&1
+_scratch_mount
+test_clone $bs
+_check_scratch_fs
+
+echo "Testing with the no-holes feature"
+_scratch_unmount
+_scratch_mkfs "-O no-holes -l $bs" >>$seqres.full 2>&1
+_scratch_mount
+test_clone $bs
+
+status=0
+exit
diff --git a/tests/btrfs/087.out b/tests/btrfs/087.out
new file mode 100644
index 000..5ddede2
--- /dev/null
+++ b/tests/btrfs/087.out
@@ -0,0 +1,35 @@
+QA output created by 087
+Testing without the no-holes feature
+wrote 131072/131072 bytes at offset 131072
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+File content after clone operations:
+000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+*
+040 aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
+*
+100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+*
+4420 aa aa aa aa aa aa aa aa aa 

[PATCH] Btrfs: fix range cloning when same inode used as source and destination

2015-03-31 Thread Filipe Manana
While searching for extents to clone we might find one where we only use
a part of it coming from its tail. If our destination inode is the same as
the source inode, we end up removing the tail part of the extent item and
inserting after it a new one that points to the same extent with an adjusted
key file offset and data offset. After this we search for the next extent
item in the fs/subvol tree with a key that has an offset incremented by
one. But this second search leaves us at the new extent item we inserted
previously, and since that extent item has a non-zero data offset, it
can make us call btrfs_drop_extents with an empty range (start == end)
which causes the following warning:

[23978.537119] WARNING: CPU: 6 PID: 16251 at fs/btrfs/file.c:550 
btrfs_drop_extent_cache+0x43/0x385 [btrfs]()
(...)
[23978.557266] Call Trace:
[23978.557978]  [] dump_stack+0x4c/0x65
[23978.559191]  [] warn_slowpath_common+0xa1/0xbb
[23978.560699]  [] ? btrfs_drop_extent_cache+0x43/0x385 
[btrfs]
[23978.562389]  [] warn_slowpath_null+0x1a/0x1c
[23978.563613]  [] btrfs_drop_extent_cache+0x43/0x385 [btrfs]
[23978.565103]  [] ? time_hardirqs_off+0x15/0x28
[23978.566294]  [] ? trace_hardirqs_off+0xd/0xf
[23978.567438]  [] __btrfs_drop_extents+0x6b/0x9e1 [btrfs]
[23978.568702]  [] ? trace_hardirqs_on+0xd/0xf
[23978.569763]  [] ? cache_alloc+0x69/0x2eb
[23978.570817]  [] ? virt_to_head_page+0x9/0x36
[23978.571872]  [] ? 
cache_alloc_debugcheck_after.isra.42+0x16c/0x1cb
[23978.573466]  [] ? 
kmemleak_alloc_recursive.constprop.52+0x16/0x18
[23978.574962]  [] btrfs_drop_extents+0x66/0x7f [btrfs]
[23978.576179]  [] btrfs_clone+0x516/0xaf5 [btrfs]
[23978.577311]  [] ? lock_extent_range+0x7b/0xcd [btrfs]
[23978.578520]  [] btrfs_ioctl_clone+0x28e/0x39f [btrfs]
[23978.580282]  [] btrfs_ioctl+0xb51/0x219a [btrfs]
(...)
[23978.591887] ---[ end trace 988ec2a653d03ed3 ]---

Then we attempt to insert a new extent item with a key that already
exists, which makes btrfs_insert_empty_item return -EEXIST resulting in
abortion of the current transaction:

[23978.594355] WARNING: CPU: 6 PID: 16251 at fs/btrfs/super.c:260 
__btrfs_abort_transaction+0x52/0x114 [btrfs]()
(...)
[23978.622589] Call Trace:
[23978.623181]  [] dump_stack+0x4c/0x65
[23978.624359]  [] warn_slowpath_common+0xa1/0xbb
[23978.625573]  [] ? __btrfs_abort_transaction+0x52/0x114 
[btrfs]
[23978.626971]  [] warn_slowpath_fmt+0x46/0x48
[23978.628003]  [] ? vprintk_default+0x1d/0x1f
[23978.629138]  [] __btrfs_abort_transaction+0x52/0x114 
[btrfs]
[23978.630528]  [] btrfs_clone+0x7fc/0xaf5 [btrfs]
[23978.631635]  [] ? lock_extent_range+0x7b/0xcd [btrfs]
[23978.632886]  [] btrfs_ioctl_clone+0x28e/0x39f [btrfs]
[23978.634119]  [] btrfs_ioctl+0xb51/0x219a [btrfs]
(...)
[23978.647714] ---[ end trace 988ec2a653d03ed4 ]---

This is wrong because we should not process the extent item that we just
inserted previously, and instead process the extent item that follows it
in the tree.

For example for the test case I wrote for fstests:

   bs=$((64 * 1024))
   mkfs.btrfs -f -l $bs -O ^no-holes /dev/sdc
   mount /dev/sdc /mnt

   xfs_io -f -c "pwrite -S 0xaa $(($bs * 2)) $(($bs * 2))" /mnt/foo

   $CLONER_PROG -s $((3 * $bs)) -d $((267 * $bs)) -l 0 /mnt/foo /mnt/foo
   $CLONER_PROG -s $((217 * $bs)) -d $((95 * $bs)) -l 0 /mnt/foo /mnt/foo

The second clone call fails with -EEXIST, because when we process the
first extent item (offset 262144), we drop part of it (counting from the
end) and then insert a new extent item with a key greater than the key we
found. The next time we search the tree we search for a key with offset
262144 + 1, which leaves us at the new extent item we have just inserted
but we think it refers to an extent that we need to clone.

Fix this by ensuring the next search key uses an offset corresponding to
the offset of the key we found previously plus the data length of the
corresponding extent item. This ensures we skip new extent items that we
inserted and works for the case of implicit holes too (NO_HOLES feature).
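
The difference between the two search offsets can be sketched with a toy
model. This is not the kernel code: extent item keys are reduced to plain
file offsets, first_key_at_or_after is a hypothetical helper, and every
number except 262144 is an assumed value picked only for illustration.

```shell
# Hypothetical model: extent item keys are represented only by their file
# offsets, passed in ascending order.

# first_key_at_or_after SEARCH KEY...
# Prints the smallest KEY that is >= SEARCH.
first_key_at_or_after()
{
	local search="$1"
	local k
	shift
	for k in "$@"; do
		if [ "$k" -ge "$search" ]; then
			echo "$k"
			return 0
		fi
	done
	return 1
}

found=262144      # offset of the extent item key we just processed
datal=131072      # assumed data length of that extent item
inserted=327680   # assumed key offset of the item the clone inserted
following=393216  # assumed key offset of the next pre-existing item

# Old behaviour: search from found + 1, which stops at the inserted item.
echo "old search lands at: $(first_key_at_or_after $((found + 1)) "$found" "$inserted" "$following")"

# Fixed behaviour: search from found + datal, which skips the inserted item.
echo "new search lands at: $(first_key_at_or_after $((found + datal)) "$found" "$inserted" "$following")"
```

In this model the old search stops at the freshly inserted item, while the
search starting at the found offset plus the data length moves past it.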

A test case for fstests follows soon.

Signed-off-by: Filipe Manana 
---
 fs/btrfs/ioctl.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 09a566a..869e39d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3205,6 +3205,8 @@ static int btrfs_clone(struct inode *src, struct inode 
*inode,
key.offset = off;
 
while (1) {
+   u64 next_key_min_offset;
+
/*
 * note the key will change type as we walk through the
 * tree.
@@ -3285,7 +3287,7 @@ process_slot:
} else if (key.offset >= off + len) {
break;
}
-
+   next_key_min_offset = key.offset + datal;
size = btrfs_item_size_nr(leaf, slot);
read_extent_buffer(leaf, buf,
   btrfs_item_ptr_offs

Re: [PATCH 1/1 linux-next] Btrfs: use BTRFS_COMPRESS_NONE instead of 0

2015-03-31 Thread David Sterba
On Mon, Mar 30, 2015 at 10:21:11PM +0200, Fabian Frederick wrote:
> cow_file_range_inline() was called with 0 instead of
> actual definition.

Please also fix the initialization in btrfs_finish_ordered_io .


Re: [PATCH v2] btrfs: explicitly set control file's private_data

2015-03-31 Thread Martin Kepplinger

On 31.03.2015 14:31, Tom Van Braeckel wrote:

Err, upon further inspection, I think that this was a false positive.

Btrfs relies on the initial value of the private_data member of a file
being NULL in the regular ioctl operation handler for
BTRFS_IOC_TRANS_START but it does not use the miscdevice framework for
those files.

It *does* use the miscdevice framework in the ioctl operation handler
of the /dev/btrfs-control file but there it does not use the file's
private_data member. So IMHO, the proposed patch is not necessary...


This is offtopic, assuming you are right and didn't find more affected
places:

Then I would say you could re-post the real change (to misc_open()) to the
relevant people for 4.2 (not 4.1), so either wait for 4.0 to be released or
try something like "for 4.2" in the topic (or as a comment after the ---
dashes in the patch email).

I would want to have it in -next for one cycle at least.

Further, I would remove the code-comment you had here
https://lkml.org/lkml/2015/1/9/718 because GregKH already pulled this in
(a little too early ;) :
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=03190c67ff72b5c56b24266762ab8abe68970f45
which is extractable kernel documentation. You could somehow link to it
in the commit message.

  martin


Re: Btrfs Send/Receive "utime" error And "WARN_ON Cannot allocate memory"

2015-03-31 Thread Filipe David Manana
On Tue, Mar 31, 2015 at 11:35 AM, Robbie Ko  wrote:
> Hi,
>
> At first, I use fsstress with following script to do random move.
> Errors could happen after several rounds of incremental send.
> Directory structure generated by fsstress is hard to find simple
> reproducible steps, so I started to make a small directory structure
> which has several files/directories with the same name.
> Then I run the following scripts to randomly move the directory and do
> send/receive. Error might occur several rounds later.
>
> while true; do
>  # Pick a random directory or file to move (never the subvolume root).
>  while true;
>  do
>   if [ 1 -eq $((RANDOM%2)) ]; then
>    src=`find "$subvolume_path" -type d | sort -R | head -1`
>   else
>    src=`find "$subvolume_path" -type f | sort -R | head -1`
>   fi
>   if [ "$src" != "$subvolume_path" ]; then
>    break
>   fi
>  done
>  # Pick a random destination directory that is not $src or inside it.
>  while true;
>  do
>   dst=`find "$subvolume_path" -type d | sort -R | head -1`
>   if [[ "$dst" != "$src"/* ]] && [ "$dst" != "$src" ]; then
>    break
>   fi
>  done
>  mv "$src" "$dst"
> done

Thanks Robbie.
That helps a lot.
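
The destination check in the quoted script (the target must be neither the
source itself nor anything below it) can be sketched as a standalone
function. is_valid_move_target is a hypothetical name used only for this
illustration, and the paths in the usage note are made up.

```shell
# Hypothetical helper: succeeds when DST is a usable target for moving SRC,
# i.e. DST is not SRC and does not live underneath SRC.
# Usage: is_valid_move_target SRC DST
is_valid_move_target()
{
	local src="$1"
	local dst="$2"

	case "$dst" in
		"$src" | "$src"/*) return 1 ;;
	esac
	return 0
}
```

With this, "is_valid_move_target /mnt/data/a /mnt/data/a/b" fails, while
"/mnt/data/ab" is accepted as a target for "/mnt/data/a" because it only
shares a name prefix and is not a descendant.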

>
> Thanks.
> robbieko
>
> 2015-03-31 17:50 GMT+08:00 Filipe David Manana :
>> On Tue, Mar 31, 2015 at 10:39 AM, Robbie Ko  wrote:
>>> HI,
>>>
>>> After taking a look at design of send/receive, I feel that severe
>>> directory structure movement combined with name collision might be a
>>> trouble maker for send to calculate the snapshot difference.
>>
>> Well the problem is not calculating the differences between snapshots.
>> The algorithm of finding new and deleted nodes/leafs in the second
>> snapshot, and the new/changed/deleted items is ok. The problem is the
>> ability to generate the correct paths (used for renames and other
>> operations) at the right moments in time, since the items are
>> processed only once and from left to right.
>>
>>> Therefore, I started using fsstress + random directory movement to
>>> test.
>>
>> Well fsstress alone won't generate these examples you have been
>> providing, because it never reuses old names (delete file/dir X, and
>> the name X won't ever be reused, rename A to B and name A won't ever
>> be used again, etc). So you gotta have something smart there that does
>> the directory movement in such a way that preserves names and inverts
>> the ancestor-descent relation between directories.
>> What is it exactly you are running?
>>
>>> Since the diff commands might contain a lot of instructions, I
>>> try to simplify them before reporting.
>>
>> Thanks for simplifying it.
>>
>>>
>>> Thanks for your help.
>>>
>>> Thanks.
>>> robbieko
>>>
>>> 2015-03-31 16:40 GMT+08:00 Filipe David Manana :
>>>> On Tue, Mar 31, 2015 at 4:06 AM, Robbie Ko wrote:
>>>>> Hi,
>>>>>
>>>>> I have been testing btrfs send/receive recently.
>>>>>
>>>>> I got an error "send ioctl failed with -12: Cannot allocate memory" on
>>>>> the send side.
>>>>> A WARN_ON triggered on "len > PATH_MAX" in fs_path_ensure_buf.
>>>>>
>>>>> I got an error "utime failed: No such file or directory" on the receive side.
>>>>> Simple steps to reproduce and related information follow.
>>>>>
>>>>> I have applied the patch at the following URL:
>>>>> https://patchwork.kernel.org/patch/6113581/.
>>>>> However, it did not fix the problem.
>>>>>
>>>>> Is there any idea about what this might be or how to fix it?
>>>>>
>>>>> Steps to reproduce:
>>>>>
>>>>>   $ mkfs.btrfs -f /dev/sdb
>>>>>   $ mount /dev/sdb /mnt
>>>>>   $ mkfs.btrfs -f /dev/sdc
>>>>>   $ mount /dev/sdc /mnt2
>>>>>
>>>>> $ mkdir -p /mnt/data
>>>>> $ mkdir -p /mnt/data/n1/n2
>>>>> $ mkdir -p /mnt/data/n4
>>>>> $ mkdir -p /mnt/data/t6/t7
>>>>> $ mkdir -p /mnt/data/t5
>>>>> $ mkdir -p /mnt/data/t7
>>>>> $ mkdir -p /mnt/data/n4/t2
>>>>> $ mkdir -p /mnt/data/t4
>>>>> $ mkdir -p /mnt/data/t3
>>>>> $ mv /mnt/data/t7 /mnt/data/n4/t2
>>>>> $ mv /mnt/data/t4 /mnt/data/n4/t2/t7
>>>>> $ mv /mnt/data/t5 /mnt/data/n4/t2/t7/t4
>>>>> $ mv /mnt/data/t6 /mnt/data/n4/t2/t7/t4/t5
>>>>> $ mv /mnt/data/n1/n2 /mnt/data/n4/t2/t7/t4/t5/t6
>>>>> $ mv /mnt/data/n1 /mnt/data/n4/t2/t7/t4/t5/t6
>>>>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/t7 /mnt/data/n4/t2/t7/t4/t5/t6/n2
>>>>> $ mv /mnt/data/t3 /mnt/data/n4/t2/t7/t4/t5/t6/n2/t7
>>>>>
>>>>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap1
>>>>>
>>>>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/n1 /mnt/data/n4
>>>>> $ mv /mnt/data/n4/t2 /mnt/data/n4/n1
>>>>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6/n2 /mnt/data/n4/n1/t2
>>>>> $ mv /mnt/data/n4/n1/t2/n2/t7/t3 /mnt/data/n4/n1/t2
>>>>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6 /mnt/data/n4/n1/t2
>>>>> $ mv /mnt/data/n4/n1/t2/t7/t4 /mnt/data/n4/n1/t2/t6
>>>>> $ mv /mnt/data/n4/n1/t2/t7 /mnt/data/n4/n1/t2/t3
>>>>> $ mv /mnt/data/n4/n1/t2/n2/t7 /mnt/data/n4/n1/t2
>>>>>
>>>>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap2
>>>>>
>>>>>   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
>>>>>   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
>>>>> ERROR: send ioctl failed with

Re: [PATCH v2] btrfs: explicitly set control file's private_data

2015-03-31 Thread Tom Van Braeckel
Err, upon further inspection, I think that this was a false positive.

Btrfs relies on the initial value of the private_data member of a file
being NULL in the regular ioctl operation handler for
BTRFS_IOC_TRANS_START but it does not use the miscdevice framework for
those files.

It *does* use the miscdevice framework in the ioctl operation handler
of the /dev/btrfs-control file but there it does not use the file's
private_data member. So IMHO, the proposed patch is not necessary...


Re: Btrfs Send/Receive "utime" error And "WARN_ON Cannot allocate memory"

2015-03-31 Thread Robbie Ko
Hi,

At first, I used fsstress together with the following script to do random
moves. Errors could show up after several rounds of incremental send.
With the directory structure generated by fsstress it is hard to find
simple reproduction steps, so I started from a small directory structure
which has several files/directories with the same name.
Then I ran the following script to randomly move directories and files and
did send/receive. An error might occur several rounds later.

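# subvolume_path must be set to the subvolume under test.
while true;
do
 # Pick a random directory or file to move, but never the subvolume root.
 while true;
 do
  if [ 1 -eq $((RANDOM%2)) ]; then
   src=`find "$subvolume_path" -type d | sort -R | head -1`
  else
   src=`find "$subvolume_path" -type f | sort -R | head -1`
  fi
  if [ -n "$src" ] && [ "$src" != "$subvolume_path" ]; then
   break
  fi
 done
 # Pick a random destination directory that is neither src nor below it.
 while true;
 do
  dst=`find "$subvolume_path" -type d | sort -R | head -1`
  if [[ "$dst" != "$src"/* ]] && [ "$dst" != "$src" ]; then
   break
  fi
 done
 mv "$src" "$dst"
done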
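
The destination test in the loop above can be looked at in isolation. The following is only a sketch: the helper name valid_move and the example paths are made up for illustration and are not part of the script that was actually run. It shows why both comparisons are needed: the destination must not be the source itself, and it must not be anywhere below the source.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): succeeds when moving
# "$1" (src) into directory "$2" (dst) cannot place src inside itself.
valid_move() {
  local src=$1 dst=$2
  # dst must not be src itself, nor anything below src.
  [[ "$dst" != "$src" && "$dst" != "$src"/* ]]
}

# Example paths are illustrative only; nothing is touched on disk.
valid_move /mnt/data/n4 /mnt/data/t3    && echo "ok: unrelated directory"
valid_move /mnt/data/n4 /mnt/data/n4/t2 || echo "rejected: dst is below src"
valid_move /mnt/data/n4 /mnt/data/n4    || echo "rejected: dst is src"
valid_move /mnt/data/n4 /mnt/data/n4x   && echo "ok: shared name prefix only"
```

Matching against "$src"/* rather than "$src"* is what keeps a sibling such as n4x from being mistaken for a descendant of n4.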

Thanks.
robbieko

2015-03-31 17:50 GMT+08:00 Filipe David Manana :
> On Tue, Mar 31, 2015 at 10:39 AM, Robbie Ko  wrote:
>> Hi,
>>
>> After taking a look at the design of send/receive, I feel that heavy
>> directory structure movement combined with name collisions might cause
>> trouble for send when calculating the snapshot difference.
>
> Well the problem is not calculating the differences between snapshots.
> The algorithm of finding new and deleted nodes/leafs in the second
> snapshot, and the new/changed/deleted items is ok. The problem is the
> ability to generate the correct paths (used for renames and other
> operations) at the right moments in time, since the items are
> processed only once and from left to right.
>
>> Therefore, I started using fsstress + random directory movement to
>> test.
>
> Well fsstress alone won't generate these examples you have been
> providing, because it never reuses old names (delete file/dir X, and
> the name X won't ever be reused, rename A to B and name A won't ever
> be used again, etc). So you gotta have something smart there that does
> the directory movement in such a way that preserves names and inverts
> the ancestor-descendant relation between directories.
> What is it exactly you are running?
>
>> Since the diff commands might contain a lot of instructions, I
>> try to simplify them before reporting.
>
> Thanks for simplifying it.
>
>>
>> Thanks for your help.
>>
>> Thanks.
>> robbieko
>>
>> 2015-03-31 16:40 GMT+08:00 Filipe David Manana :
>>> On Tue, Mar 31, 2015 at 4:06 AM, Robbie Ko  wrote:
>>>> Hi,
>>>>
>>>> I have been testing btrfs send/receive recently.
>>>>
>>>> I got an error "send ioctl failed with -12: Cannot allocate memory" on
>>>> the send side.
>>>> A WARN_ON triggered on "len > PATH_MAX" in fs_path_ensure_buf.
>>>>
>>>> I got an error "utime failed: No such file or directory" on the receive side.
>>>> Simple steps to reproduce and related information follow.
>>>>
>>>> I have applied the patch at the following URL:
>>>> https://patchwork.kernel.org/patch/6113581/.
>>>> However, it did not fix the problem.
>>>>
>>>> Is there any idea about what this might be or how to fix it?
>>>>
>>>> Steps to reproduce:
>>>>
>>>>   $ mkfs.btrfs -f /dev/sdb
>>>>   $ mount /dev/sdb /mnt
>>>>   $ mkfs.btrfs -f /dev/sdc
>>>>   $ mount /dev/sdc /mnt2
>>>>
>>>> $ mkdir -p /mnt/data
>>>> $ mkdir -p /mnt/data/n1/n2
>>>> $ mkdir -p /mnt/data/n4
>>>> $ mkdir -p /mnt/data/t6/t7
>>>> $ mkdir -p /mnt/data/t5
>>>> $ mkdir -p /mnt/data/t7
>>>> $ mkdir -p /mnt/data/n4/t2
>>>> $ mkdir -p /mnt/data/t4
>>>> $ mkdir -p /mnt/data/t3
>>>> $ mv /mnt/data/t7 /mnt/data/n4/t2
>>>> $ mv /mnt/data/t4 /mnt/data/n4/t2/t7
>>>> $ mv /mnt/data/t5 /mnt/data/n4/t2/t7/t4
>>>> $ mv /mnt/data/t6 /mnt/data/n4/t2/t7/t4/t5
>>>> $ mv /mnt/data/n1/n2 /mnt/data/n4/t2/t7/t4/t5/t6
>>>> $ mv /mnt/data/n1 /mnt/data/n4/t2/t7/t4/t5/t6
>>>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/t7 /mnt/data/n4/t2/t7/t4/t5/t6/n2
>>>> $ mv /mnt/data/t3 /mnt/data/n4/t2/t7/t4/t5/t6/n2/t7
>>>>
>>>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap1
>>>>
>>>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/n1 /mnt/data/n4
>>>> $ mv /mnt/data/n4/t2 /mnt/data/n4/n1
>>>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6/n2 /mnt/data/n4/n1/t2
>>>> $ mv /mnt/data/n4/n1/t2/n2/t7/t3 /mnt/data/n4/n1/t2
>>>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6 /mnt/data/n4/n1/t2
>>>> $ mv /mnt/data/n4/n1/t2/t7/t4 /mnt/data/n4/n1/t2/t6
>>>> $ mv /mnt/data/n4/n1/t2/t7 /mnt/data/n4/n1/t2/t3
>>>> $ mv /mnt/data/n4/n1/t2/n2/t7 /mnt/data/n4/n1/t2
>>>>
>>>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap2
>>>>
>>>>   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
>>>>   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
>>>> ERROR: send ioctl failed with -12: Cannot allocate memory
>>>> ERROR: utimes data/n4/t2/t7/t4/t5/t6/n2 failed. No such file or directory.
>>>
>>>
>>> Hi,
>>>
>>> May I ask if these reproducers come from a real application/workload
>>> or are they generated through some fuzz tester

Re: Btrfs Send/Receive "utime" error And "WARN_ON Cannot allocate memory"

2015-03-31 Thread Filipe David Manana
On Tue, Mar 31, 2015 at 10:39 AM, Robbie Ko  wrote:
> Hi,
>
> After taking a look at the design of send/receive, I feel that heavy
> directory structure movement combined with name collisions might cause
> trouble for send when calculating the snapshot difference.

Well the problem is not calculating the differences between snapshots.
The algorithm of finding new and deleted nodes/leafs in the second
snapshot, and the new/changed/deleted items is ok. The problem is the
ability to generate the correct paths (used for renames and other
operations) at the right moments in time, since the items are
processed only once and from left to right.

> Therefore, I started using fsstress + random directory movement to
> test.

Well fsstress alone won't generate these examples you have been
providing, because it never reuses old names (delete file/dir X, and
the name X won't ever be reused, rename A to B and name A won't ever
be used again, etc). So you gotta have something smart there that does
the directory movement in such a way that preserves names and inverts
the ancestor-descendant relation between directories.
What is it exactly you are running?
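
For illustration, the kind of change described above (names preserved, ancestor-descendant relation inverted) can be sketched with two plain directories. This is only an assumption-laden example: it uses a hypothetical scratch directory from mktemp instead of a btrfs subvolume, the names t7 and t4 are borrowed from the reproducer, and no snapshot or send is involved.

```shell
#!/bin/bash
set -e
# Hypothetical scratch directory standing in for a subvolume.
root=$(mktemp -d)

# First state: t7 is an ancestor of t4.
mkdir -p "$root/t7/t4"
before=$(cd "$root" && find . -mindepth 1 -type d | sort)

# Second state: the same two names, but now t4 is an ancestor of t7.
mv "$root/t7/t4" "$root/t4"
mv "$root/t7" "$root/t4/t7"
after=$(cd "$root" && find . -mindepth 1 -type d | sort)

printf 'before:\n%s\nafter:\n%s\n' "$before" "$after"
```

If a snapshot were taken at each state, an incremental send between them would have to emit the renames in an order where every path it refers to exists at that moment, which is the difficulty with path generation mentioned above.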

> Since the diff commands might contain a lot of instructions, I
> try to simplify them before reporting.

Thanks for simplifying it.

>
> Thanks for your help.
>
> Thanks.
> robbieko
>
> 2015-03-31 16:40 GMT+08:00 Filipe David Manana :
>> On Tue, Mar 31, 2015 at 4:06 AM, Robbie Ko  wrote:
>>> Hi,
>>>
>>> I have been testing btrfs send/receive recently.
>>>
>>> I got an error "send ioctl failed with -12: Cannot allocate memory" on
>>> the send side.
>>> A WARN_ON triggered on "len > PATH_MAX" in fs_path_ensure_buf.
>>>
>>> I got an error "utime failed: No such file or directory" on the receive side.
>>> Simple steps to reproduce and related information follow.
>>>
>>> I have applied the patch at the following URL:
>>> https://patchwork.kernel.org/patch/6113581/.
>>> However, it did not fix the problem.
>>>
>>> Is there any idea about what this might be or how to fix it?
>>>
>>> Steps to reproduce:
>>>
>>>   $ mkfs.btrfs -f /dev/sdb
>>>   $ mount /dev/sdb /mnt
>>>   $ mkfs.btrfs -f /dev/sdc
>>>   $ mount /dev/sdc /mnt2
>>>
>>> $ mkdir -p /mnt/data
>>> $ mkdir -p /mnt/data/n1/n2
>>> $ mkdir -p /mnt/data/n4
>>> $ mkdir -p /mnt/data/t6/t7
>>> $ mkdir -p /mnt/data/t5
>>> $ mkdir -p /mnt/data/t7
>>> $ mkdir -p /mnt/data/n4/t2
>>> $ mkdir -p /mnt/data/t4
>>> $ mkdir -p /mnt/data/t3
>>> $ mv /mnt/data/t7 /mnt/data/n4/t2
>>> $ mv /mnt/data/t4 /mnt/data/n4/t2/t7
>>> $ mv /mnt/data/t5 /mnt/data/n4/t2/t7/t4
>>> $ mv /mnt/data/t6 /mnt/data/n4/t2/t7/t4/t5
>>> $ mv /mnt/data/n1/n2 /mnt/data/n4/t2/t7/t4/t5/t6
>>> $ mv /mnt/data/n1 /mnt/data/n4/t2/t7/t4/t5/t6
>>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/t7 /mnt/data/n4/t2/t7/t4/t5/t6/n2
>>> $ mv /mnt/data/t3 /mnt/data/n4/t2/t7/t4/t5/t6/n2/t7
>>>
>>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap1
>>>
>>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/n1 /mnt/data/n4
>>> $ mv /mnt/data/n4/t2 /mnt/data/n4/n1
>>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6/n2 /mnt/data/n4/n1/t2
>>> $ mv /mnt/data/n4/n1/t2/n2/t7/t3 /mnt/data/n4/n1/t2
>>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6 /mnt/data/n4/n1/t2
>>> $ mv /mnt/data/n4/n1/t2/t7/t4 /mnt/data/n4/n1/t2/t6
>>> $ mv /mnt/data/n4/n1/t2/t7 /mnt/data/n4/n1/t2/t3
>>> $ mv /mnt/data/n4/n1/t2/n2/t7 /mnt/data/n4/n1/t2
>>>
>>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap2
>>>
>>>   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
>>>   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
>>> ERROR: send ioctl failed with -12: Cannot allocate memory
>>> ERROR: utimes data/n4/t2/t7/t4/t5/t6/n2 failed. No such file or directory.
>>
>>
>> Hi,
>>
>> May I ask if these reproducers come from a real application/workload
>> or are they generated through some fuzz tester (like fsstress)?
>> It would be easier, and save time, if you mention how to get such
>> reproducers instead of sending the same e-mail over and over again but
>> with different reproducers.
>>
>> I'll take a look at it soon.
>>
>> thanks
>>
>>
>>
>>
>> --
>> Filipe David Manana,
>>
>> "Reasonable men adapt themselves to the world.
>>  Unreasonable men adapt the world to themselves.
>>  That's why all progress depends on unreasonable men."



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


Re: Btrfs Send/Receive "utime" error And "WARN_ON Cannot allocate memory"

2015-03-31 Thread Robbie Ko
Hi,

After taking a look at the design of send/receive, I feel that heavy
directory structure movement combined with name collisions might cause
trouble for send when calculating the snapshot difference.
Therefore, I started using fsstress + random directory movement to
test. Since the diff commands might contain a lot of instructions, I
try to simplify them before reporting.

Thanks for your help.

Thanks.
robbieko

2015-03-31 16:40 GMT+08:00 Filipe David Manana :
> On Tue, Mar 31, 2015 at 4:06 AM, Robbie Ko  wrote:
>> Hi,
>>
>> I have been testing btrfs send/receive recently.
>>
>> I got an error "send ioctl failed with -12: Cannot allocate memory" on
>> the send side.
>> A WARN_ON triggered on "len > PATH_MAX" in fs_path_ensure_buf.
>>
>> I got an error "utime failed: No such file or directory" on the receive side.
>> Simple steps to reproduce and related information follow.
>>
>> I have applied the patch at the following URL:
>> https://patchwork.kernel.org/patch/6113581/.
>> However, it did not fix the problem.
>>
>> Is there any idea about what this might be or how to fix it?
>>
>> Steps to reproduce:
>>
>>   $ mkfs.btrfs -f /dev/sdb
>>   $ mount /dev/sdb /mnt
>>   $ mkfs.btrfs -f /dev/sdc
>>   $ mount /dev/sdc /mnt2
>>
>> $ mkdir -p /mnt/data
>> $ mkdir -p /mnt/data/n1/n2
>> $ mkdir -p /mnt/data/n4
>> $ mkdir -p /mnt/data/t6/t7
>> $ mkdir -p /mnt/data/t5
>> $ mkdir -p /mnt/data/t7
>> $ mkdir -p /mnt/data/n4/t2
>> $ mkdir -p /mnt/data/t4
>> $ mkdir -p /mnt/data/t3
>> $ mv /mnt/data/t7 /mnt/data/n4/t2
>> $ mv /mnt/data/t4 /mnt/data/n4/t2/t7
>> $ mv /mnt/data/t5 /mnt/data/n4/t2/t7/t4
>> $ mv /mnt/data/t6 /mnt/data/n4/t2/t7/t4/t5
>> $ mv /mnt/data/n1/n2 /mnt/data/n4/t2/t7/t4/t5/t6
>> $ mv /mnt/data/n1 /mnt/data/n4/t2/t7/t4/t5/t6
>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/t7 /mnt/data/n4/t2/t7/t4/t5/t6/n2
>> $ mv /mnt/data/t3 /mnt/data/n4/t2/t7/t4/t5/t6/n2/t7
>>
>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap1
>>
>> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/n1 /mnt/data/n4
>> $ mv /mnt/data/n4/t2 /mnt/data/n4/n1
>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6/n2 /mnt/data/n4/n1/t2
>> $ mv /mnt/data/n4/n1/t2/n2/t7/t3 /mnt/data/n4/n1/t2
>> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6 /mnt/data/n4/n1/t2
>> $ mv /mnt/data/n4/n1/t2/t7/t4 /mnt/data/n4/n1/t2/t6
>> $ mv /mnt/data/n4/n1/t2/t7 /mnt/data/n4/n1/t2/t3
>> $ mv /mnt/data/n4/n1/t2/n2/t7 /mnt/data/n4/n1/t2
>>
>>   $ btrfs subvolume snapshot -r /mnt /mnt/snap2
>>
>>   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
>>   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
>> ERROR: send ioctl failed with -12: Cannot allocate memory
>> ERROR: utimes data/n4/t2/t7/t4/t5/t6/n2 failed. No such file or directory.
>
>
> Hi,
>
> May I ask if these reproducers come from a real application/workload
> or are they generated through some fuzz tester (like fsstress)?
> It would be easier, and save time, if you mention how to get such
> reproducers instead of sending the same e-mail over and over again but
> with different reproducers.
>
> I'll take a look at it soon.
>
> thanks
>
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."


Re: the wiki?

2015-03-31 Thread André-Sebastian Liebe
Hi Dave,

you should forward your question directly to the kernel.org webmaster.
They are using MediaWiki in a multisite environment for all
kernel.org-related wikis.

regards,
André

On 03/31/2015 04:00 AM, Dave Stevens wrote:
> I've been reading the wiki to help sort out an issue and learn more
> about RAID. I wanted to download it to my local drive to read offline
> and that facility seems to not be there. I checked the MediaWiki
> version in the source and it is an LTS version for which support ends
> in two months. Are there plans to migrate the content to a newer version?
>
> Dave
>
>



Re: Btrfs Send/Receive "utime" error And "WARN_ON Cannot allocate memory"

2015-03-31 Thread Filipe David Manana
On Tue, Mar 31, 2015 at 4:06 AM, Robbie Ko  wrote:
> Hi,
>
> I have been testing btrfs send/receive recently.
>
> I got an error "send ioctl failed with -12: Cannot allocate memory" on
> the send side.
> A WARN_ON triggered on "len > PATH_MAX" in fs_path_ensure_buf.
>
> I got an error "utime failed: No such file or directory" on the receive side.
> Simple steps to reproduce and related information follow.
>
> I have applied the patch at the following URL:
> https://patchwork.kernel.org/patch/6113581/.
> However, it did not fix the problem.
>
> Is there any idea about what this might be or how to fix it?
>
> Steps to reproduce:
>
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt
>   $ mkfs.btrfs -f /dev/sdc
>   $ mount /dev/sdc /mnt2
>
> $ mkdir -p /mnt/data
> $ mkdir -p /mnt/data/n1/n2
> $ mkdir -p /mnt/data/n4
> $ mkdir -p /mnt/data/t6/t7
> $ mkdir -p /mnt/data/t5
> $ mkdir -p /mnt/data/t7
> $ mkdir -p /mnt/data/n4/t2
> $ mkdir -p /mnt/data/t4
> $ mkdir -p /mnt/data/t3
> $ mv /mnt/data/t7 /mnt/data/n4/t2
> $ mv /mnt/data/t4 /mnt/data/n4/t2/t7
> $ mv /mnt/data/t5 /mnt/data/n4/t2/t7/t4
> $ mv /mnt/data/t6 /mnt/data/n4/t2/t7/t4/t5
> $ mv /mnt/data/n1/n2 /mnt/data/n4/t2/t7/t4/t5/t6
> $ mv /mnt/data/n1 /mnt/data/n4/t2/t7/t4/t5/t6
> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/t7 /mnt/data/n4/t2/t7/t4/t5/t6/n2
> $ mv /mnt/data/t3 /mnt/data/n4/t2/t7/t4/t5/t6/n2/t7
>
>   $ btrfs subvolume snapshot -r /mnt /mnt/snap1
>
> $ mv /mnt/data/n4/t2/t7/t4/t5/t6/n1 /mnt/data/n4
> $ mv /mnt/data/n4/t2 /mnt/data/n4/n1
> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6/n2 /mnt/data/n4/n1/t2
> $ mv /mnt/data/n4/n1/t2/n2/t7/t3 /mnt/data/n4/n1/t2
> $ mv /mnt/data/n4/n1/t2/t7/t4/t5/t6 /mnt/data/n4/n1/t2
> $ mv /mnt/data/n4/n1/t2/t7/t4 /mnt/data/n4/n1/t2/t6
> $ mv /mnt/data/n4/n1/t2/t7 /mnt/data/n4/n1/t2/t3
> $ mv /mnt/data/n4/n1/t2/n2/t7 /mnt/data/n4/n1/t2
>
>   $ btrfs subvolume snapshot -r /mnt /mnt/snap2
>
>   $ btrfs send /mnt/snap1 | btrfs receive /mnt2
>   $ btrfs send -p /mnt/snap1 /mnt/snap2 | btrfs receive /mnt2
> ERROR: send ioctl failed with -12: Cannot allocate memory
> ERROR: utimes data/n4/t2/t7/t4/t5/t6/n2 failed. No such file or directory.


Hi,

May I ask if these reproducers come from a real application/workload
or are they generated through some fuzz tester (like fsstress)?
It would be easier, and save time, if you mention how to get such
reproducers instead of sending the same e-mail over and over again but
with different reproducers.

I'll take a look at it soon.

thanks




-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


Re: [PATCH] btrfs: wait for delayed iputs on no space

2015-03-31 Thread Tsutomu Itoh
On 2015/03/27 19:21, Zhaolei wrote:
> From: Zhao Lei 
> 
> This is another fix for the no_space case.
> 
> All patches fixing the no_space bug are available on the fix_no_space
> branch at:
>git://github.com/zhaoleidd/btrfs

I tested with the following patch applied on top of the
'fix_no_space' branch:
 - Btrfs: fix find_free_dev_extent() malfunction in case device tree has hole

It works fine for me. Thanks.

Tested-by: Tsutomu Itoh 

> 
> Any suggestions are welcome.
> 
> Zhao Lei (1):
>btrfs: wait for delayed iputs on no space
> 
>   fs/btrfs/extent-tree.c | 3 +++
>   1 file changed, 3 insertions(+)
> 



Re: WARNING at fs/btrfs/super.c:260 __btrfs_abort_transaction (error -17)

2015-03-31 Thread Sophie Dexter

On 30/03/2015 22:21, Chris Mason wrote:



On Mon, Mar 30, 2015 at 10:05 AM, Sophie Dexter
 wrote:

On 24/03/15 17:34, Chris Mason wrote:

You have great timing, there are two reports of a very similar abort
with 4.0-rc5, but your report makes it clear these are not a regression
from 4.0-rc4.

Are you able to run btrfsck on this filesystem?  I'd like to check for
metadata inconsistencies.

-chris


Hi Chris,

Haha, great timing is the secret of good comedy lol

OpenWrt has only very recently signed off the 3.18 kernel as the default
kernel for my router, I was using a build with 3.14 when I converted my
disk and saw the same problem :!: I may have posted something I haven't
repeated here in the OpenWrt ticket I opened:

https://dev.openwrt.org/ticket/19216

I previously checked and scrubbed the disk when the problem first
occurred and happily no problems were found then. Although, I had to use
another computer because btrfs check doesn't complete on my router, the
process is killed due to lack of memory (btrfs invoked oom-killer) :-(
Should I start another topic for this or just accept that that problem
is due to a lack of memory?

I have just run btrfs check again using (yet another) laptop and I think
everything is still OK:

# btrfs check /dev/sdb1
Checking filesystem on /dev/sdb1
UUID: ----
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 930516788539 bytes used err is 0
total csum bytes: 1234353920
total tree bytes: 1458515968
total fs tree bytes: 54571008
total extent tree bytes: 66936832
btree space waste bytes: 73372568
file data blocks allocated: 1264250781696
  referenced 1264250781696
Btrfs v3.14.1
# uname -a
Linux ##-- 3.16.0-31-generic #43-Ubuntu SMP Tue Mar 10
17:37:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Kind regards,
Sophie x



I want to continue to use BTRFS as far as possible and have moved this
disk to a Raspberry Pi now because of the problems I encountered with
it when it was plugged into my router. I'd rather do this than convert
back to ext3/4. I'm using Open Media Vault on my Raspberry Pi and,
fingers crossed, haven't had any problems over the weekend.

I can only guess, given its inability to complete a btrfs check, that
a router doesn't have enough memory for BTRFS. I'm happy to move my
disk back to my router to try things out and help develop BTRFS for
small computers, but for now at least it has a new home.


I have an image that can reproduce this bug, and I'm trying to figure
out where we've gone wrong.  Hopefully end of day tomorrow I'll have
more ideas, but it's a runtime error related to extent management. Since
the FS check is clean, the FS itself isn't corrupt.

-chris


Hi Chris,

That sounds promising. I'll try my disk on my router again when you have 
something you want to test on a wider audience.


Sophie x
