Re: Replicate snapshot to second machine fails

2015-02-08 Thread cwillu
This isn't a btrfs-send or a btrfs-receive question:

$ echo hi | ssh machine.local sudo echo test
sudo: no tty present and no askpass program specified

How were you planning on providing credentials to sudo?
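For reference, the two usual ways around this (usernames and paths below are illustrative, not from the thread): force a TTY for interactive testing, or grant passwordless sudo for exactly the receive command so the non-interactive pipe works.

```shell
# For a quick interactive test, force a pseudo-tty so sudo can prompt.
# Note this is NOT suitable for "btrfs send | ssh ...", because the tty
# layer mangles the binary stream:
#   ssh -t vm1-debian sudo btrfs receive /home/.snapshots/

# For unattended replication, a sudoers fragment on the receiving machine
# (e.g. /etc/sudoers.d/btrfs-receive; the username and btrfs path are
# assumptions, verify the path with "which btrfs"):
user ALL=(root) NOPASSWD: /usr/bin/btrfs receive /home/.snapshots/
```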

On Sun, Feb 8, 2015 at 9:17 AM, Thomas Schneider  wrote:
>
>
> Hi,
>
> I want to replicate a snapshot from PC1 to virtual machine using this command:
>
> user@pc1-gigabyte ~ $ sudo btrfs send 
> "/home/.snapshots/lmde_home_2015-02-07_02:38:21" | ssh vm1-debian sudo btrfs 
> receive /home/.snapshots/
> At subvol /home/.snapshots/lmde_home_2015-02-07_02:38:21
> sudo: Kein TTY vorhanden und kein »askpass«-Programm angegeben ("no TTY present and no askpass program specified")
>
> Unfortunately I cannot detect the root cause for the failure.
> Any ideas?
>
> On both machines I have installed programs ssh-askpass and ssh-askpass-gnome.
>
>
> user@pc1-gigabyte ~ $ uname -a
> Linux pc1-gigabyte 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) 
> i686 GNU/Linux
> user@pc1-gigabyte ~ $ sudo btrfs --version
> Btrfs v3.17
> user@pc1-gigabyte ~ $ sudo btrfs fi show
> Label: none  uuid: 236fe36a-3187-4955-977d-f22cd818c424
> Total devices 1 FS bytes used 103.96GiB
> devid1 size 147.00GiB used 147.00GiB path /dev/sda5
> Btrfs v3.17
> user@pc1-gigabyte ~ $ sudo btrfs fi df /home
> Data, single: total=142.97GiB, used=103.03GiB
> System, DUP: total=8.00MiB, used=16.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, DUP: total=2.00GiB, used=957.17MiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=320.00MiB, used=0.00B
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH v2] mount.btrfs helper

2014-11-30 Thread cwillu
Sorry, misread "initrdless" as "initramfs".

In #btrfs, I usually say something like "do you gain enough by not
using an initfs for this to be worth the hassle?", but of course,
that's not an argument against making mount smarter.

On Sun, Nov 30, 2014 at 4:57 PM, Dimitri John Ledkov  wrote:
> On 30 November 2014 at 22:31, cwillu  wrote:
>>
>> In ubuntu, the initfs runs a btrfs dev scan, which should catch
>> anything that would be missed there.
>>
>
> I'm sorry, udev rules are not sufficient in the initramfs-less case,
> as outlined.
>
> In case of booting with initramfs, indeed, both Debian & Ubuntu
> include snippets there to run btrfs scan.
>
> --
> Regards,
>
> Dimitri.


Re: [RFC][PATCH v2] mount.btrfs helper

2014-11-30 Thread cwillu
In ubuntu, the initfs runs a btrfs dev scan, which should catch
anything that would be missed there.

On Sun, Nov 30, 2014 at 4:11 PM, Dimitri John Ledkov  wrote:
> Hello,
>
> On 30 November 2014 at 17:43, Goffredo Baroncelli  wrote:
>> Hi all,
>>
>> this patch provides a "mount.btrfs" helper for the mount command.
>> A btrfs filesystem could span several disks. This helper scans all the
>> partitions to discover all the disks required to mount a filesystem.
> So it is no longer necessary to "scan" the partitions manually before
> mounting a filesystem.
>>
>
> I would welcome this, as a general idea. At the moment in debian &
> ubuntu, btrfs tools package ships udev rules to call "btrfs scan"
> whenever device nodes appear.
>
> If scan is built into mount, I would be able to drop that udev rule.
> There are also some reports (not yet re-verified) that such a udev rule
> is not effective: btrfs mount fails when attempted before udev has been
> run, e.g. an initrd-less boot trying to mount btrfs filesystems before
> udev-trigger has run (to process "cold-plug" events).
>
> --
> Regards,
>
> Dimitri.


Re: TEST PING

2014-10-12 Thread cwillu
On Sun, Oct 12, 2014 at 2:45 PM, royy walls  wrote:
>
> --

http://www.tux.org/lkml/#s3

"Test" messages are very, very inappropriate on the lkml or any other
list, for that matter. If you want to know whether the subscribe
succeeded, wait for a couple of hours after you get a reply from the
mailing list software saying it did. You'll undoubtedly get a number
of list messages. If you want to know whether you can post, you must
have something important to say, right? After you have read the
following paragraphs, compose a real letter, not a test message, in an
editor, saving the body of the letter in the off chance your post
doesn't succeed. Then post your letter to lkml. Please remember that
there are quite a number of subscribers, and it will take a while for
your letter to be reflected back to you. An hour is not too long to
wait.


Re: What is the vision for btrfs fs repair?

2014-10-10 Thread cwillu
If -o recovery is necessary, then you're either running into a btrfs
bug, or your hardware is lying about when it has actually written
things to disk.

The first case isn't unheard of, although far less common than it used
to be, and it should continue to improve with time.

In the second case, you're potentially screwed regardless of the
filesystem, without doing hacks like "wait a good long time before
returning from fsync in the hopes that the disk might actually have
gotten around to performing the write it said had already finished."

On Fri, Oct 10, 2014 at 5:12 AM, Bob Marley  wrote:
> On 10/10/2014 12:59, Roman Mamedov wrote:
>>
>> On Fri, 10 Oct 2014 12:53:38 +0200
>> Bob Marley  wrote:
>>
>>> On 10/10/2014 03:58, Chris Murphy wrote:
>
> * mount -o recovery
> "Enable autorecovery attempts if a bad tree root is found at
> mount time."

 I'm confused why it's not the default yet. Maybe it's continuing to
 evolve at a pace that suggests something could sneak in that makes things
 worse? It is almost an oxymoron in that I'm manually enabling an
 autorecovery.

 If true, maybe the closest indication we'd get of btrfs stability is the
 default enabling of autorecovery.
>>>
>>> No way!
>>> I wouldn't want a default like that.
>>>
>>> If you think at distributed transactions: suppose a sync was issued on
>>> both sides of a distributed transaction, then power was lost on one
>>> side
>>
>> What distributed transactions? Btrfs is not a clustered filesystem[1], it
>> does
>> not support and likely will never support being mounted from multiple
>> hosts at
>> the same time.
>>
>> [1]http://en.wikipedia.org/wiki/Clustered_file_system
>>
>
> This is not the only way to do a distributed transaction.
> Databases can be hosted on the filesystem, and those can do distributed
> transactions.
> Think of two bank accounts, one on btrfs fs1 here, and another bank account
> on a database on whatever filesystem in another country. You want to debit
> one account and credit the other one: the filesystems at the two sides *must
> not roll back their state*!! (especially not transparently, without human
> intervention)
>
>


Re: [PATCH RFC] btrfs-progs: Add simple python front end to the search ioctl

2014-09-23 Thread cwillu
Damn you gmail...


Re: [PATCH RFC] btrfs-progs: Add simple python front end to the search ioctl

2014-09-23 Thread cwillu
On Tue, Sep 23, 2014 at 10:39 AM, Chris Mason  wrote:
>
> This is a starting point for a debugfs style python interface using
> the search ioctl.  For now it can only do one thing, which is to
> print out all the extents in a file and calculate the compression ratio.
>
> Over time it will grow more features, especially for the kinds of things
> we might run btrfs-debug-tree to find out.  Expect the usage and output
> to change dramatically over time (don't hard code to it).
>
> Signed-off-by: Chris Mason 
> ---
>  btrfs-debugfs | 296 
> ++
>  1 file changed, 296 insertions(+)
>  create mode 100755 btrfs-debugfs
>
> diff --git a/btrfs-debugfs b/btrfs-debugfs
> new file mode 100755
> index 000..cf1d285
> --- /dev/null
> +++ b/btrfs-debugfs
> @@ -0,0 +1,296 @@
> +#!/usr/bin/env python2
> +#
> +# Simple python program to print out all the extents of a single file
> +# LGPLv2 license
> +# Copyright Facebook 2014
> +
> +import sys,os,struct,fcntl,ctypes,stat
> +
> +# helpers for max ints
> +maxu64 = (1L << 64) - 1
> +maxu32 = (1L << 32) - 1
> +
> +# the inode (like form stat)
> +BTRFS_INODE_ITEM_KEY = 1
> +# backref to the directory
> +BTRFS_INODE_REF_KEY = 12
> +# backref to the directory v2
> +BTRFS_INODE_EXTREF_KEY = 13
> +# xattr items
> +BTRFS_XATTR_ITEM_KEY = 24
> +# orphans for list files
> +BTRFS_ORPHAN_ITEM_KEY = 48
> +# treelog items for dirs
> +BTRFS_DIR_LOG_ITEM_KEY = 60
> +BTRFS_DIR_LOG_INDEX_KEY = 72
> +# dir items and dir indexes both hold filenames
> +BTRFS_DIR_ITEM_KEY = 84
> +BTRFS_DIR_INDEX_KEY = 96
> +# these are the file extent pointers
> +BTRFS_EXTENT_DATA_KEY = 108
> +# csums
> +BTRFS_EXTENT_CSUM_KEY = 128
> +# root item for subvols and snapshots
> +BTRFS_ROOT_ITEM_KEY = 132
> +# root item backrefs
> +BTRFS_ROOT_BACKREF_KEY = 144
> +BTRFS_ROOT_REF_KEY = 156
> +# each allocated extent has an extent item
> +BTRFS_EXTENT_ITEM_KEY = 168
> +# optimized extents for metadata only
> +BTRFS_METADATA_ITEM_KEY = 169
> +# backrefs for extents
> +BTRFS_TREE_BLOCK_REF_KEY = 176
> +BTRFS_EXTENT_DATA_REF_KEY = 178
> +BTRFS_EXTENT_REF_V0_KEY = 180
> +BTRFS_SHARED_BLOCK_REF_KEY = 182
> +BTRFS_SHARED_DATA_REF_KEY = 184
> +# one of these for each block group
> +BTRFS_BLOCK_GROUP_ITEM_KEY = 192
> +# dev extents records which part of each device is allocated
> +BTRFS_DEV_EXTENT_KEY = 204
> +# dev items describe devs
> +BTRFS_DEV_ITEM_KEY = 216
> +# one for each chunk
> +BTRFS_CHUNK_ITEM_KEY = 228
> +# qgroup info
> +BTRFS_QGROUP_STATUS_KEY = 240
> +BTRFS_QGROUP_INFO_KEY = 242
> +BTRFS_QGROUP_LIMIT_KEY = 244
> +BTRFS_QGROUP_RELATION_KEY = 246
> +# records balance progress
> +BTRFS_BALANCE_ITEM_KEY = 248
> +# stats on device errors
> +BTRFS_DEV_STATS_KEY = 249
> +BTRFS_DEV_REPLACE_KEY = 250
> +BTRFS_STRING_ITEM_KEY = 253
> +
> +# in the kernel sources, this is flattened
> +# btrfs_ioctl_search_args_v2.  It includes both the btrfs_ioctl_search_key
> +# and the buffer.  We're using a 64K buffer size.
> +#
> +args_buffer_size = 65536
> +class btrfs_ioctl_search_args(ctypes.Structure):

Put comments like these in triple-quoted strings just inside the class
or function you're defining; this makes them accessible using the
standard help() system:

class foo(bar):
    """
    In the kernel sources, this is ...
    """

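A runnable version of that suggestion (the class name, fields, and docstring text below are illustrative, modeled on the struct being reviewed):

```python
import ctypes

class btrfs_ioctl_search_args_demo(ctypes.Structure):
    """
    In the kernel sources this is a flattened btrfs_ioctl_search_args_v2:
    the search key fields followed by the result buffer.  Keeping the
    comment in a docstring makes it visible via help() and __doc__.
    """
    _pack_ = 1
    _fields_ = [("tree_id", ctypes.c_ulonglong),
                ("nr_items", ctypes.c_uint)]

# help(btrfs_ioctl_search_args_demo) shows the text above; __doc__
# exposes it programmatically:
print(btrfs_ioctl_search_args_demo.__doc__.strip().splitlines()[0])
```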
> +_pack_ = 1
> +_fields_ = [ ("tree_id", ctypes.c_ulonglong),
> + ("min_objectid", ctypes.c_ulonglong),
> + ("max_objectid", ctypes.c_ulonglong),
> + ("min_offset", ctypes.c_ulonglong),
> + ("max_offset", ctypes.c_ulonglong),
> + ("min_transid", ctypes.c_ulonglong),
> + ("max_transid", ctypes.c_ulonglong),
> + ("min_type", ctypes.c_uint),
> + ("max_type", ctypes.c_uint),
> + ("nr_items", ctypes.c_uint),
> + ("unused", ctypes.c_uint),
> + ("unused1", ctypes.c_ulonglong),
> + ("unused2", ctypes.c_ulonglong),
> + ("unused3", ctypes.c_ulonglong),
> + ("unused4", ctypes.c_ulonglong),
> + ("buf_size", ctypes.c_ulonglong),
> + ("buf", ctypes.c_ubyte * args_buffer_size),
> +   ]
> +
> +# the search ioctl returns one header for each item


> +class btrfs_ioctl_search_header(ctypes.Structure):
> +_pack_ = 1
> +_fields_ = [ ("transid", ctypes.c_ulonglong),
> + ("objectid", ctypes.c_ulonglong),
> + ("offset", ctypes.c_ulonglong),
> + ("type", ctypes.c_uint),
> + ("len", ctypes.c_uint),
> +   ]
> +
> +# the type field in btrfs_file_extent_item
> +BTRFS_FILE_EXTENT_INLINE = 0
> +BTRFS_FILE_EXTENT_REG = 1
> +BTRFS_FILE_EXTENT_PREALLOC = 2
> +
> +class btrfs_file_extent_item(ctypes.LittleEndianStructure):
> +_pack_ = 1
> +_fields_ = [ ("generation", ctypes.c_ulonglong),
> + ("ram_bytes", ctypes.c_ul

Re: [systemd-devel] Slow startup of systemd-journal on BTRFS

2014-06-16 Thread cwillu
It's not an mmap problem; it's a problem of small writes with an msync
or fsync after each one.

For the case of sequential writes (via write or mmap), padding writes
to page boundaries would help, if the wasted space isn't an issue.
Another approach, again assuming all other writes are appends, would
be to periodically (but frequently enough that the pages are still in
cache) read a chunk of the file and write it back in-place, with or
without an fsync. On the other hand, if you can afford to lose some
logs on a crash, not fsyncing/msyncing after each write will also
eliminate the fragmentation.

(Worth pointing out that none of that is conjecture, I just spent 30
minutes testing those cases while composing this ;p)

Josef has mentioned in irc that a piece of Chris' raid5/6 work will
also fix this when it lands.

On Mon, Jun 16, 2014 at 1:52 PM, Martin  wrote:
> On 16/06/14 17:05, Josef Bacik wrote:
>>
>> On 06/16/2014 03:14 AM, Lennart Poettering wrote:
>>> On Mon, 16.06.14 10:17, Russell Coker (russ...@coker.com.au) wrote:
>>>
> I am not really following though why this trips up btrfs though. I am
> not sure I understand why this breaks btrfs COW behaviour. I mean,
>
 I don't believe that fallocate() makes any difference to
 fragmentation on
 BTRFS.  Blocks will be allocated when writes occur so regardless of an
 fallocate() call the usage pattern in systemd-journald will cause
 fragmentation.
>>>
>>> journald's write pattern looks something like this: append something to
>>> the end, make sure it is written, then update a few offsets stored at
>>> the beginning of the file to point to the newly appended data. This is
>>> of course not easy to handle for COW file systems. But then again, it's
>>> probably not too different from access patterns of other database or
>>> database-like engines...
>
> Even though this appears to be a problem case for btrfs/COW, is there a
> more favourable write/access sequence, easily implemented, that works
> well for both ext4-like fs /and/ COW fs?
>
> Database-like writing is known to be 'difficult' for filesystems: can a
> data log be a simpler case?
>
>
>> Was waiting for you to show up before I said anything since most systemd
>> related emails always devolve into how evil you are rather than what is
>> actually happening.
>
> Ouch! Hope you two know each other!! :-P :-)
>
>
> [...]
>> since we shouldn't be fragmenting this badly.
>>
>> Like I said what you guys are doing is fine, if btrfs falls on its face
>> then its not your fault.  I'd just like an exact idea of when you guys
>> are fsync'ing so I can replicate in a smaller way.  Thanks,
>
> Good if COW can be so resilient. I have about 2GBytes of data logging
> files and I must defrag those as part of my backups to stop the system
> fragmenting to a stop (I use "cp -a" to defrag the files to a new area
> and restart the data software logger on that).
>
>
> Random thoughts:
>
> Would using a second small file just for the mmap-ed pointers help avoid
> repeated rewriting of random offsets in the log file causing excessive
> fragmentation?
>
> Align the data writes to 16kByte or 64kByte boundaries/chunks?
>
> Are mmap-ed files a similar problem to using a swap file and so should
> the same "btrfs file swap" code be used for both?
>
>
> Not looked over the code so all random guesses...
>
> Regards,
> Martin
>
>
>
>


Re: [systemd-devel] Slow startup of systemd-journal on BTRFS

2014-06-15 Thread cwillu
Fallocate is a red herring except insofar as it's a hint that btrfs
isn't making much use of: you see the same behaviour with small writes
to an mmap'ed file that's msync'ed after each write, and likewise with
plain old appending small writes with an fsync after each write, with
or without fallocating the file first.  Looking at the fiemap output
while doing either of those, you'll see a new 4k extent being made,
and then the physical location of that extent will increment until the
writes move on to the next 4k extent.

cwillu@cwillu-home:~/work/btrfs/e2fs$ touch /tmp/test

>>> f=open('/tmp/test', 'r+')
>>> m=mmap.mmap(f.fileno(), size)
>>> for x in xrange(size):
...   m[x] = " "
...   m.flush(x / 4096 * 4096, 4096)   # msync(self->data + offset,
size, MS_SYNC)) {

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
start: 0, length: 80
fs_ioc_fiemap 3223348747d
File /tmp/test has 3 extents:
#   Logical  Physical Length   Flags
0:   000b3d9c 1000 
1:  1000 00069f012000 0000003ff000 
2:  0040 000b419d1000 0040 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b3daf3000 1000 
1:  1000 00069f012000 0000003ff000 
2:  0040 000b419d1000 0040 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b3dc38000 1000 
1:  1000 00069f012000 0000003ff000 
2:  0040 000b419d1000 0040 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b3dc9f000 1000 
1:  1000 000b3d2b7000 1000 
2:  2000 00069f013000 0000003fe000 
3:  0040 000b419d1000 0040 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat
/tmp/test -c %s)# msync(self->data + offset, size, MS_SYNC)) {
0:   000b3dc9f000 1000 
1:  1000 000b3d424000 1000 
2:  2000 00069f013000 0000003fe000 
3:  0040 000b419d1000 0040 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b3dc9f000 1000 
1:  1000 000b3d563000 1000 
2:  2000 00069f013000 003fe000 
3:  0040 000b419d1000 0040 0001



cwillu@cwillu-home:~/work/btrfs/e2fs$ rm /tmp/test
cwillu@cwillu-home:~/work/btrfs/e2fs$ touch /tmp/test

>>> f=open('/tmp/test', 'r+')
>>> f.truncate(size)
>>> m=mmap.mmap(f.fileno(), size)
>>> for x in xrange(size):
...   m[x] = " "
...   m.flush(x / 4096 * 4096, 4096)   # msync(self->data + offset,
size, MS_SYNC)) {

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
start: 0, length: 80
fs_ioc_fiemap 3223348747d
File /tmp/test has 1 extents:
#   Logical  Physical Length   Flags
0:   000b47f11000 00001000 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b48006000 1000 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b48183000 1000 
1:  1000 000b48255000 1000 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b48183000 1000 
1:  1000 000b48353000 1000 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b48183000 1000 
1:  1000 000b493ed000 00001000 
2:  2000 000b4a68f000 1000 
3:  3000 000b4b36f000 1000 0001

cwillu@cwillu-home:~/work/btrfs/e2fs$ fiemap /tmp/test 0 $(stat /tmp/test -c %s)
0:   000b48183000 1000 
1:  1000 000b493ed000 00001000 
2:  2000 000b4a68f000 1000 
3:  3000 000b4b4cf000 1000 0001



cwillu@cwillu-home:~/work/btrfs/e2fs$ rm /tmp/test
cwillu@cwillu-home:~/work/btrfs/e2fs$ touch /tmp/test

>>> f=open('/tmp/test', '

Re: [PATCH 24/27] btrfs-progs: Convert man page for btrfs-zero-log

2014-04-05 Thread cwillu
On Fri, Apr 4, 2014 at 12:46 PM, Marc MERLIN  wrote:
> On Wed, Apr 02, 2014 at 04:29:35PM +0800, Qu Wenruo wrote:
>> Convert man page for btrfs-zero-log
>>
>> Signed-off-by: Qu Wenruo 
>> ---
>>  Documentation/Makefile   |  2 +-
>>  Documentation/btrfs-zero-log.txt | 39 
>> +++
>>  2 files changed, 40 insertions(+), 1 deletion(-)
>>  create mode 100644 Documentation/btrfs-zero-log.txt
>>
>> diff --git a/Documentation/Makefile b/Documentation/Makefile
>> index e002d53..de06629 100644
>> --- a/Documentation/Makefile
>> +++ b/Documentation/Makefile
>> @@ -11,7 +11,7 @@ MAN8_TXT += btrfs-image.txt
>>  MAN8_TXT += btrfs-map-logical.txt
>>  MAN8_TXT += btrfs-show-super.txt
>>  MAN8_TXT += btrfstune.txt
>> -#MAN8_TXT += btrfs-zero-log.txt
>> +MAN8_TXT += btrfs-zero-log.txt
>>  #MAN8_TXT += fsck.btrfs.txt
>>  #MAN8_TXT += mkfs.btrfs.txt
>>
>> diff --git a/Documentation/btrfs-zero-log.txt 
>> b/Documentation/btrfs-zero-log.txt
>> new file mode 100644
>> index 000..e3041fa
>> --- /dev/null
>> +++ b/Documentation/btrfs-zero-log.txt
>> @@ -0,0 +1,39 @@
>> +btrfs-zero-log(8)
>> +=
>> +
>> +NAME
>> +
>> +btrfs-zero-log - clear out log tree
>> +
>> +SYNOPSIS
>> +
>> +'btrfs-zero-log' 
>> +
>> +DESCRIPTION
>> +---
>> +'btrfs-zero-log' will remove the log tree if the log tree is corrupt, which will
>> +allow you to mount the filesystem again.
>> +
>> +The common case where this happens has been fixed a long time ago,
>> +so it is unlikely that you will see this particular problem.
>
> A note on this one: this can happen if your SSD writes things in the
> wrong order, or potentially writes garbage when power is lost or before
> locking up.
> I hit this problem about 10 times and it wasn't a btrfs bug, just the
> drive doing bad things.

And -o recovery didn't work around it?  My understanding is that -o
recovery will skip reading the log.


Re: Building a brtfs filesystem < 70M?

2014-03-10 Thread cwillu
Have you tried the -M option to mkfs.btrfs?  I'm not sure if we select
it automatically (or if we do, whether you have recent enough tools to
have that).


Re: [Repost] Is BTRFS "bedup" maintained ?

2014-03-05 Thread cwillu
Bedup was/is a third-party project; I'm not sure if its developer follows this list.


Might be worth filing a bug or otherwise poking the author on
https://github.com/g2p/bedup

On Wed, Mar 5, 2014 at 2:43 PM, Marc MERLIN  wrote:
> On Wed, Mar 05, 2014 at 06:24:40PM +0100, Swāmi Petaramesh wrote:
>> Hello,
>>
>> (Not having received a single answer, I repost this...)
>
> I got your post, and posted myself about bedup not working at all for me,
> and got no answer either.
>
> As far as I can tell, it's entirely unmaintained and was likely just a proof
> of concept until the kernel can do it itself, and that's not entirely
> finished from what I understand.
>
> It's a bit disappointing, but hopefully it'll get fixed eventually.
>
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/


Re: No space left on device (again)

2014-02-25 Thread cwillu
Try btrfs filesystem balance start -dusage=15 /home, and gradually
increase it until you see it relocate at least one chunk.
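The escalation can be scripted; a sketch assuming root and the /home mountpoint from this thread (illustrative only, not tested against any particular btrfs-progs version):

```shell
# Raise the usage filter until the balance actually relocates a chunk;
# "had to relocate 0 out of N chunks" means the threshold was too low
# and nothing matched the filter.
for pct in 5 10 15 25 50; do
    out=$(btrfs balance start -dusage="$pct" /home) || break
    echo "dusage=$pct: $out"
    case "$out" in
        *"relocate 0 out of"*) ;;   # nothing matched; try a higher threshold
        *) break ;;                 # at least one chunk relocated
    esac
done
```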

On Tue, Feb 25, 2014 at 2:27 PM, Marcus Sundman  wrote:
> On 25.02.2014 22:19, Hugo Mills wrote:
>>
>> On Tue, Feb 25, 2014 at 01:05:51PM -0500, Jim Salter wrote:
>>>
>>> 370GB of 410GB used isn't really "fine", it's over 90% usage.
>>>
>>> That said, I'd be interested to know why btrfs fi show /dev/sda3
>>> shows 412.54G used, but btrfs fi df /home shows 379G used...
>>
>> This is an FAQ...
>>
>> btrfs fi show tells you how much is allocated out of the available
>> pool on each disk. btrfs fi df then shows how much of that allocated
>> space (in each category) is used.
>
>
> What is the difference between the "used 371.11GB" and the "used 412.54GB"
> displayed by "btrfs fi show"?
>
>
>> The problem here is also in the FAQ: the metadata is close to full
>> -- typically something like 500-750 MiB of headroom is needed in
>> metadata. The FS can't allocate more metadata because it's allocated
>> everything already (total=used in btrfs fi show), so the solution is
>> to do a filtered balance:
>>
>> btrfs balance start -dusage=5 /mountpoint
>
>
> Of course that was the first thing I tried, and it didn't help *at* *all*:
>
>> # btrfs filesystem balance start -dusage=5 /home
>> Done, had to relocate 0 out of 415 chunks
>> #
>
>
> ... and it really didn't free anything.
>
>
>>> On 02/25/2014 11:49 AM, Marcus Sundman wrote:

 Hi

 I get "No space left on device" and it is unclear why:

> # df -h|grep sda3
> /dev/sda3   413G  368G   45G  90% /home
> # btrfs filesystem show /dev/sda3
> Label: 'home'  uuid: 46279061-51f4-40c2-afd0-61d6faab7f60
> Total devices 1 FS bytes used 371.11GB
> devid1 size 412.54GB used 412.54GB path /dev/sda3
>
> Btrfs v0.20-rc1
> # btrfs filesystem df /home
> Data: total=410.52GB, used=369.61GB
> System: total=4.00MB, used=64.00KB
> Metadata: total=2.01GB, used=1.50GB
> #

 So, 'data' and 'metadata' seem to be fine(?), but 'system' is a
 bit low. Is that it? If so, can I do something about it? Or should
 I look somewhere else?

 I really wish I could get a warning before running out of disk
 space, instead of everything breaking suddenly when there seems to
 be lots and lots of space left.

 - Marcus

>


Re: What to do about df and btrfs fi df

2014-02-10 Thread cwillu
On Mon, Feb 10, 2014 at 7:02 PM, Roger Binns  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 10/02/14 10:24, cwillu wrote:
>> The regular df data used number should be the amount of space required
>> to hold a backup of that content (assuming that the backup maintains
>> reflinks and compression and so forth).
>>
>> There's no good answer for available space;
>
> I think the flipside of the above works well.  How large a group of files
> can you expect to create before you will get ENOSPC?
>
> That for example is the check code does that looks at df - "I need to put
> in XGB of files - will it fit?"  It is also what users do.

But the answer changes dramatically depending on whether it's large
numbers of small files or a small number of large files, and the
conservative worst-case choice means we report a number that is half
what is probably expected.


Re: What to do about df and btrfs fi df

2014-02-10 Thread cwillu
> In the past [1] I proposed the following approach.
>
> $ sudo btrfs filesystem df /mnt/btrfs1/
> Disk size:   400.00GB
> Disk allocated:8.04GB
> Disk unallocated:391.97GB
> Used: 11.29MB
> Free (Estimated):250.45GB   (Max: 396.99GB, min: 201.00GB)
> Data to disk ratio:  63 %

Note that a big chunk of the problem is "what do we do with the
regular system df output".  I don't mind this as a btrfs fi df summary
though.


Re: What to do about df and btrfs fi df

2014-02-10 Thread cwillu
>> IMO, used should definitely include metadata, especially given that we
>> inline small files.
>>
>> I can convince myself both that this implies that we should roll it
>> into b_avail, and that we should go the other way and only report the
>> actual used number for metadata as well, so I might just plead
>> insanity here.
>
> I could be convinced to do this.  So we have
>
> total: (total disk bytes) / (raid multiplier)
> used: (total used in data block groups) +
> (total used in metadata block groups)
> avail: total - (total used in data block groups +
> total metadata block groups)
>
> That seems like the simplest to code up.  Then we can argue about whether to
> use the total metadata size or just the used metadata size for b_avail.
> Seem reasonable?

I can't think of any situations where this results in tears.
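As a toy model of that arithmetic (all numbers are invented; this sketches the proposal under discussion, not what statfs actually returns):

```python
GiB = 1 << 30

def statfs_model(disk_bytes, raid_mult, data_used, meta_used, meta_total):
    """Model of the proposed numbers: 'used' counts bytes actually used in
    data and metadata block groups, while 'avail' charges metadata at its
    full allocated size."""
    total = disk_bytes // raid_mult
    used = data_used + meta_used
    avail = total - (data_used + meta_total)
    return total, used, avail

# 200 GiB of raw disk with a raid multiplier of 2 (e.g. dup metadata and
# raid1 data), 80 GiB of data, 1 GiB used of a 2 GiB metadata allocation:
total, used, avail = statfs_model(200 * GiB, 2,
                                  data_used=80 * GiB,
                                  meta_used=1 * GiB,
                                  meta_total=2 * GiB)
print(total // GiB, used // GiB, avail // GiB)  # 100 81 18
```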


Re: What to do about df and btrfs fi df

2014-02-10 Thread cwillu
IMO, used should definitely include metadata, especially given that we
inline small files.

I can convince myself both that this implies that we should roll it
into b_avail, and that we should go the other way and only report the
actual used number for metadata as well, so I might just plead
insanity here.

On Mon, Feb 10, 2014 at 12:28 PM, Josef Bacik  wrote:
>
>
> On 02/10/2014 01:24 PM, cwillu wrote:
>>
>> I concur.
>>
>> The regular df data used number should be the amount of space required
>> to hold a backup of that content (assuming that the backup maintains
>> reflinks and compression and so forth).
>>
>> There's no good answer for available space; the statfs syscall isn't
>> rich enough to cover all the bases even in the face of dup metadata
>> and single data (i.e., the common case), and a truly conservative
>> estimate (report based on the highest-usage raid level in use) would
>> report space/2 on that same common case.  "Highest-usage data raid
>> level in use" is probably the best compromise, with a big warning that
>> large numbers of small files will not actually fit, posted
>> in some mythical place that users look.
>>
>> I would like to see the information from btrfs fi df and btrfs fi show
>> summarized somewhere (ideally as a new btrfs fi df output), as both
>> sets of numbers are really necessary, or at least have btrfs fi df
>> include the amount of space not allocated to a block group.
>>
>> Re regular df: are we adding space allocated to a block group (raid1,
>> say) but not in actual use in a file as the N/2 space available in the
>> block group, or the N space it takes up on disk?  This probably
>> matters a bit less than it used to, but if it's N/2, that leaves us
>> open to "empty filesystem, 100GB free, write an 80GB file and then
>> delete it, wtf, only 60GB free now?" reporting issues.
>>
>
> The only case we add the actual allocated chunk space is for metadata, for
> data we only add the actual used number.  So say you write an 80GB file
> and then delete it, but during the writing we allocated a 1GiB chunk for
> metadata: you'll see only 99GB free, make sense?  We could (should?) roll
> this into the b_avail magic and make "used" really only reflect data usage,
> opinions on this?  Thanks,
>
> Josef


Re: What to do about df and btrfs fi df

2014-02-10 Thread cwillu
I concur.

The regular df data used number should be the amount of space required
to hold a backup of that content (assuming that the backup maintains
reflinks and compression and so forth).

There's no good answer for available space; the statfs syscall isn't
rich enough to cover all the bases even in the face of dup metadata
and single data (i.e., the common case), and a truly conservative
estimate (report based on the highest-usage raid level in use) would
report space/2 on that same common case.  "Highest-usage data raid
level in use" is probably the best compromise, with a big warning
that large numbers of small files will not actually fit, posted in
some mythical place that users look.

I would like to see the information from btrfs fi df and btrfs fi show
summarized somewhere (ideally as a new btrfs fi df output), as both
sets of numbers are really necessary, or at least have btrfs fi df
include the amount of space not allocated to a block group.

Re regular df: are we adding space allocated to a block group (raid1,
say) but not in actual use in a file as the N/2 space available in the
block group, or the N space it takes up on disk?  This probably
matters a bit less than it used to, but if it's N/2, that leaves us
open to "empty filesystem, 100GB free, write an 80GB file and then
delete it, wtf, only 60GB free now?" reporting issues.


Re: Provide a better free space estimate on RAID1

2014-02-08 Thread cwillu
Everyone who has actually looked at what the statfs syscall returns
and how df (and everyone else) uses it, keep talking.  Everyone else,
go read that source code first.

There is _no_ combination of values you can return in statfs which
will not be grossly misleading in some common scenario that someone
cares about.


Re: Are nocow files snapshot-aware

2014-02-06 Thread cwillu
On Thu, Feb 6, 2014 at 6:32 PM, Kai Krakow  wrote:
> Duncan <1i5t5.dun...@cox.net> schrieb:
>
>>> Ah okay, that makes it clear. So, actually, in the snapshot the file is
>>> still nocow - just for the exception that blocks being written to become
>>> unshared and relocated. This may introduce a lot of fragmentation but it
>>> won't become worse when rewriting the same blocks over and over again.
>>
>> That also explains the report of a NOCOW VM-image still triggering the
>> snapshot-aware-defrag-related pathology.  It was a _heavily_ auto-
>> snapshotted btrfs (thousands of snapshots, something like every 30
>> seconds or more frequent, without thinning them down right away), and the
>> continuing VM writes would nearly guarantee that many of those snapshots
>> had unique blocks, so the effect was nearly as bad as if it wasn't NOCOW
>> at all!
>
> The question here is: Does it really make sense to create such snapshots of
> disk images currently online and running a system. They will probably be
> broken anyway after rollback - or at least I'd not fully trust the contents.
>
> VM images should not be part of a subvolume of which snapshots are taken at
> a regular and short interval. The problem will go away if you follow this
> rule.
>
> The same applies to probably any kind of file which you make nocow - e.g.
> database files. Most of those file implement their own way of transaction
> protection or COW system, e.g. look at InnoDB files. Neither they gain
> anything from using IO schedulers (because InnoDB internally does block
> sorting and prioritizing and knows better, doing otherwise even hurts
> performance), nor they gain from file system semantics like COW (because it
> does its own transactions and atomic updates and probably can do better for
> its use case). Similar applies to disk images (imagine ZFS, NTFS, ReFS, or
> btrfs images on btrfs). Snapshots can only do harm here (the only
> "protection" use case would be to have a backup, but snapshots are no
> backups), and COW will probably hurt performance a lot. The only use case is
> taking _controlled_ snapshots - and doing it every 30 seconds is by all means
> NOT controlled, it's completely non-deterministic.

If the database/virtual machine/whatever is crash safe, then the
atomic state that a snapshot grabs will be useful.


Re: BTRFS corrupted by combination of mistreatment of hiberantion and accidental power loss.

2014-01-29 Thread cwillu
You'd have been better off to just throw away the hibernated image:
mounting the filesystem would look like any other recovery from a
crash, and would have replayed the log and committed a new
transaction, in addition to whatever other disk writes happened due to
boot logs and so forth.

In this case, I suspect you'd have been perfectly fine.

When resuming the hibernated image at that point, however, the kernel
will have its own ideas about the state on disk (i.e., whatever state
it had in memory), partially undoing a subset of the changes from the
previous boot and generally making a mess of things.

That said, have you tried mounting with -o recovery yet?  I wouldn't
be surprised if btrfs-restore was also able to retrieve most of
everything.  Either way, I'd be suspicious of the filesystem, and
would look to restore from backup to a fresh fs.



On Wed, Jan 29, 2014 at 8:50 AM, Adam Ryczkowski
 wrote:
> I have two independent Linux installations my notebook, both sharing the
> same btrfs partition as root file system, but installed on different
> subvolumes.
>
> I hibernated one Linux (Mint 15 64 bit). Hibernation data is stored on the
> swap file, which is used exclusively by this system.
>
> Then 2 events happened.
>
> 1) I accidentally ran the other system, which wasn't hibernated - Ubuntu
> 12.10. Realizing the problem, I waited until the system booted up, and then
> shutdowned it.
>
> Then I opened the hibernated Mint 15. Restoration went successful, and I
> never thought I am in trouble.
>
> 2) Immediately after that, by coincidence, the battery fell down, brutally
> powering down the computer.
>
> After that, I am unable to repair/mount the root btrfs partition, however I
> try (I built the current btrfs-tools from git). Dmesg displays only one
> error entry: btrfs: open_ctree failed.
>
> I know, that if one those two events happened separately, there would be no
> problem. The problem arose only when those two events happened
> simultaneously.
>
> So I guess I am experiencing one of the corner cases.
>
> What are my prospects to restoring my data? I have several subvolumes on the
> hard drive, some of them were not touched by the accident at all.
>
> Adam Ryczkowski
>


Re: btrfs and ECC RAM

2014-01-17 Thread cwillu
On Fri, Jan 17, 2014 at 6:23 PM, Ian Hinder  wrote:
> Hi,
>
> I have been reading a lot of articles online about the dangers of using ZFS 
> with non-ECC RAM.  Specifically, the fact that when good data is read from 
> disk and compared with its checksum, a RAM error can cause the read data to 
> be incorrect, causing a checksum failure, and the bad data might now be 
> written back to the disk in an attempt to correct it, corrupting it in the 
> process.  This would be exacerbated by a scrub, which could run through all 
> your data and potentially corrupt it.  There is a strong current of opinion 
> that using ZFS without ECC RAM is "suicide for your data".

That sounds entirely silly:  a scrub will only write data to the disk
that has actually passed a checksum. In order for that to corrupt
something on disk, you'd have to have a perfect storm of correct and
corrupt reads, and in every such case that I can think of, you'd be
worse off without checksums than if you had them.


Re: Subvolume creation returns file exists

2013-11-15 Thread cwillu
On Fri, Nov 15, 2013 at 9:27 AM, Hugo Mills  wrote:
> On Fri, Nov 15, 2013 at 02:33:58PM +, Alin Dobre wrote:
>> We are using btrfs filesystems in our infrastructure and, at some
>> point of time, they start refusing to create new subvolumes.
>>
>> Each file system is being quota initialized immediately after its
>> creation (with "btrfs quota enable") and then all subfolders under
>> the root directory are created as subvolumes (btrfs subvolume
>> create). Over time, these subvolumes may also be deleted. What's
>> under subvolumes are just various files and directories, should not
>> be related to this problem.
>>
>> After a while of using this setup, without any obvious steps to
>> reproduce it, the filesystem goes into a state where the following
>> happens:
>> # btrfs subvolume create btrfs_mount/test_subvolume
>> Create subvolume 'btrfs_mount/test_subvolume'
>> ERROR: cannot create subvolume - File exists
>
>We've had someone else with this kind of symptom (snapshot/subvol
> creation fails unexpectedly) on IRC recently. I don't think they've
> got to the bottom of it yet, but the investigation is ongoing. I've
> cc'd Carey in on this, because he was the one trying to debug it.
>
>Hugo.
>
>> In regards to data, the filesystem is pretty empty, it only has a
>> single empty directory. I don't know about the metadata, at this
>> point.
>>
>> The problem goes away if we disable and re-enable the quota. It all
>> seems to be some dead metadata lying around.

And indeed, it turns out I did have quotas enabled, and disabling them
restores the ability to create subvolumes.


Re: Odp: Re: Odp: Btrfs might be gradually slowing the boot process

2013-11-08 Thread cwillu
On Fri, Nov 8, 2013 at 1:46 PM, Hugo Mills  wrote:
> On Fri, Nov 08, 2013 at 08:37:37PM +0100, y...@wp.pl wrote:
>> Sure;
>>
>> the kernel line from grub.cfg:
>> linux   /boot/vmlinuz-linux root=UUID=c26e6d9a-0bbb-436a-a217-95c738b5b9c6 
>> rootflags=noatime,space_cache rw quiet
>
>OK, this may be your problem. You're generating the space cache
> every time you boot. You only need it once; let the disk activity on
> boot finish (it may take a while, depending on how big your filesystem
> is, and how much data it has, and how fragmented it is), and remove
> the space_cache option from your rootflags. When you next boot, it
> will use the existing cache rather than generating it again from
> scratch.

While it's true you only need to mount with it once, mounting with
space_cache will only generate it if it doesn't already exist.  The
existence of a valid space cache generation in the super actually
enables exactly the same flag that space_cache/nospace_cache toggles,
in the very same function where the mount option is checked (this is
basically how the "you only need to mount with it once" magic is
implemented).

super.c:
int btrfs_parse_options(struct btrfs_root *root, char *options)
{
        ...
        cache_gen = btrfs_super_cache_generation(root->fs_info->super_copy);
        if (cache_gen)
                btrfs_set_opt(info->mount_opt, SPACE_CACHE);
        ...
        case Opt_space_cache:
                btrfs_set_opt(info->mount_opt, SPACE_CACHE);
                break;
        ...
}


Re: btrfs and default directory link count

2013-11-08 Thread cwillu
On Fri, Nov 8, 2013 at 5:07 AM, Andreas Schneider  wrote:
> Hello,
>
> I did run the Samba testsuite and have a failing test
> (samba.vfstest.stream_depot). It revealed that it only fails on btrfs. The
> reason is that a simple check fails:
>
> if (smb_fname_base->st.st_ex_nlink == 2)
>
> If you create a directory on btrfs and check stat:
>
> $ mkdir x
> $ stat x
>   File: ‘x’
>   Size: 0   Blocks: 0  IO Block: 4096   directory
> Device: 2bh/43d Inode: 3834720 Links: 1
> Access: (0755/drwxr-xr-x)  Uid: ( 1000/ asn)   Gid: (  100/   users)
> Access: 2013-11-08 11:54:32.431040963 +0100
> Modify: 2013-11-08 11:54:32.430040956 +0100
> Change: 2013-11-08 11:54:32.430040956 +0100
>  Birth: -
>
> then you see Links: 1. On ext4 or other filesystems:
>
> mkdir x
> stat x
>   File: ‘x’
>   Size: 4096Blocks: 8  IO Block: 4096   directory
> Device: fd00h/64768dInode: 8126886 Links: 2
> Access: (0755/drwxr-xr-x)  Uid: ( 1000/ asn)   Gid: (  100/   users)
> Access: 2013-11-08 11:54:55.428212340 +0100
> Modify: 2013-11-08 11:54:55.427212319 +0100
> Change: 2013-11-08 11:54:55.427212319 +0100
>  Birth: -
>
> the link count for a directory differs: Links: 2.
>
> Why is btrfs different here? Could someone explain this?

As I understand it, inferring the number of directory entries from
st_nlink is an optimization that isn't universally valid. If that
count is 1, it must be considered invalid, and programs that don't
handle this correctly are broken.  Coreutils handle this, at least...


Re: btrfsck errors is it save to fix?

2013-11-04 Thread cwillu
On Mon, Nov 4, 2013 at 3:14 PM, Hendrik Friedel  wrote:
> Hello,
>
> the list was quite full with patches, so this might have been hidden.
> Here the complete Stack.
> Does this help? Is this what you needed?
>> [95764.899294] CPU: 1 PID: 21798 Comm: umount Tainted: GFCIO
>> 3.11.0-031100rc2-generic #201307211535

Can you reproduce the problem under the released 3.11 or 3.12?  An
-rc2 is still pretty early in the release cycle, and I wouldn't be at
all surprised if it was a bug added and fixed in a later rc.


Re: btrfsck errors is it save to fix?

2013-11-02 Thread cwillu
> Now that I am searching, I see this in dmesg:
> [95764.899359]  [] free_fs_root+0x99/0xa0 [btrfs]
> [95764.899384]  [] btrfs_drop_and_free_fs_root+0x93/0xc0
> [btrfs]
> [95764.899408]  [] del_fs_roots+0xcf/0x130 [btrfs]
> [95764.899433]  [] close_ctree+0x146/0x270 [btrfs]
> [95764.899461]  [] btrfs_put_super+0x19/0x20 [btrfs]
> [95764.899493]  [] btrfs_kill_super+0x1a/0x90 [btrfs]

Need to see the rest of the trace this came from.


Re: kernel BUG at fs/btrfs/relocation.c:1060 during rebalancing

2013-10-06 Thread cwillu
Another user has just reported this in IRC on 3.11.2:

kernel BUG at fs/btrfs/relocation.c:1055!
invalid opcode:  [#1] SMP
Modules linked in: ebtable_nat nf_conntrack_netbios_ns
nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6
ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables bnep
ip6table_filter ip6_tables arc4 x86_pkg_temp_thermal coretemp
kvm_intel ath9k_htc joydev ath9k_common ath9k_hw ath kvm
snd_hda_codec_hdmi mac80211 cfg80211 iTCO_wdt iTCO_vendor_support
ath3k r8169 btusb snd_hda_codec_realtek snd_hda_intel mii
snd_hda_codec snd_hwdep serio_raw snd_seq snd_seq_device mxm_wmi
snd_pcm bluetooth mei_me microcode i2c_i801 rfkill shpchp lpc_ich
mfd_core mei wmi mperf snd_page_alloc snd_timer snd soundcore uinput
btrfs libcrc32c xor zlib_deflate raid6_pq dm_crypt hid_logitech_dj
i915 crc32_pclmul crc32c_intel ghash_clmulni_intel i2c_algo_bit
drm_kms_helper drm i2c_core video
CPU: 1 PID: 564 Comm: btrfs-balance Not tainted 3.11.2-201.fc19.x86_64 #1
Hardware name: ECS Z77H2-AX/Z77H2-AX, BIOS 4.6.5 10/25/2012
task: 8807ee1c1e80 ti: 8807f1cc8000 task.ti: 8807f1cc8000
RIP: 0010:[]  []
build_backref_tree+0x1077/0x1130 [btrfs]
RSP: 0018:8807f1cc9ab8  EFLAGS: 00010246
RAX:  RBX: 8807eef77480 RCX: dead00200200
RDX: 8807f1cc9b28 RSI: 8807f1cc9b28 RDI: 8807ef5896d0
RBP: 8807f1cc9b98 R08: 8807ef5896d0 R09: 0001
R10: a01f5483 R11:  R12: 8807ef5896d0
R13: 8807ef5896c0 R14: 8807f22ee360 R15: 8807f0e62000
FS:  () GS:88081f24() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f7d97749a90 CR3: 0007e38ef000 CR4: 001407e0
Stack:
 8807f0e62580 8807eef77a80 8807ef5899e0 8807eef77780
 8807ea5ab000 8807f22ee360 8807eef777c0 8807f22ee000
 8807f0e62120 8807eef77a80 8807f0e62020 
Call Trace:
 [] relocate_tree_blocks+0x1d8/0x630 [btrfs]
 [] ? add_data_references+0x248/0x280 [btrfs]
 [] relocate_block_group+0x280/0x690 [btrfs]
 [] btrfs_relocate_block_group+0x19f/0x2e0 [btrfs]
 [] btrfs_relocate_chunk.isra.32+0x6f/0x740 [btrfs]
 [] ? btrfs_set_path_blocking+0x39/0x80 [btrfs]
 [] ? btrfs_search_slot+0x382/0x940 [btrfs]
 [] ? free_extent_buffer+0x4f/0xa0 [btrfs]
 [] btrfs_balance+0x8e7/0xe80 [btrfs]
 [] balance_kthread+0x70/0x80 [btrfs]
 [] ? btrfs_balance+0xe80/0xe80 [btrfs]
 [] kthread+0xc0/0xd0
 [] ? insert_kthread_work+0x40/0x40
 [] ret_from_fork+0x7c/0xb0
 [] ? insert_kthread_work+0x40/0x40
Code: 4c 89 f7 e8 0c 0c f9 ff 48 8b bd 58 ff ff ff e8 00 0c f9 ff 48
83 bd 38 ff ff ff 00 0f 85 1e fe ff ff 31 c0 e9 5d f0 ff ff 0f 0b <0f>
0b 48 8b 73 18 48 89 c7 e8 49 f3 01 00 48 8b 85 38 ff ff ff
RIP  [] build_backref_tree+0x1077/0x1130 [btrfs]
 RSP 

On Wed, Sep 25, 2013 at 11:26 PM, Guenther Starnberger
 wrote:
> On Wed, Sep 25, 2013 at 04:46:41PM +0200, David Sterba wrote:
>
>> 3.12-rc really? I'd like to see the stacktrace then.
>
> Yes - this also happens on 3.12-rc kernels. Here's the stacktrace for 4b97280
> (which is several commits ahead of 3.12-rc2):
>
> [  126.735598] btrfs: disk space caching is enabled
> [  126.737038] btrfs: has skinny extents
> [  144.769929] BTRFS debug (device dm-0): unlinked 1 orphans
> [  144.836240] btrfs: continuing balance
> [  153.441134] btrfs: relocating block group 1542996361216 flags 1
> [  295.780293] btrfs: found 18 extents
> [  310.107200] [ cut here ]
> [  310.108496] kernel BUG at fs/btrfs/relocation.c:1060!
> [  310.109709] invalid opcode:  [#1] PREEMPT SMP
> [  310.110268] Modules linked in: btrfs raid6_pq crc32c libcrc32c xor xts 
> gf128mul dm_crypt dm_mod usb_storage psmouse ppdev e1000 evdev pcspkr 
> serio_raw joydev microcode snd_intel8x0 snd_ac97_codec i2c_piix4 i2c_core 
> ac97_bus snd_pcm snd_page_alloc snd_timer parport_pc parport snd soundcore 
> intel_agp button battery processor ac intel_gtt ext4 crc16 mbcache jbd2 
> hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi ohci_pci 
> ata_piix ahci libahci ohci_hcd ehci_pci ehci_hcd usbcore usb_common libata 
> scsi_mod
> [  310.110268] CPU: 0 PID: 366 Comm: btrfs-balance Not tainted 
> 3.12.0-1-00083-g4b97280-dirty #1
> [  310.110268] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS 
> VirtualBox 12/01/2006
> [  310.110268] task: 880078b0 ti: 880078afe000 task.ti: 
> 880078afe000
> [  310.110268] RIP: 0010:[]  [] 
> build_backref_tree+0x112a/0x11d0 [btrfs]
> [  310.110268] RSP: 0018:880078affab8  EFLAGS: 00010246
> [  310.110268] RAX:  RBX: 8800784d4000 RCX: 
> 88006a2a9d90
> [  310.110268] RDX: 880078affb30 RSI: 8800784d4020 RDI: 
> 88006a2a9d80
> [  310.110268] RBP: 880078affba0 R08: 880077d07e00 R09: 
> 880078affa

Re: [PATCH] Drop unused parameter from btrfs_item_nr

2013-09-17 Thread cwillu
On Mon, Sep 16, 2013 at 8:58 AM, Ross Kirk  wrote:
> Unused parameter cleanup
>
> Ross Kirk (1):
>   btrfs: drop unused parameter from btrfs_item_nr
>
>  fs/btrfs/backref.c|2 +-
>  fs/btrfs/ctree.c  |   34 +-
>  fs/btrfs/ctree.h  |   13 ++---
>  fs/btrfs/dir-item.c   |2 +-
>  fs/btrfs/inode-item.c |2 +-
>  fs/btrfs/inode.c  |4 ++--
>  fs/btrfs/print-tree.c |2 +-
>  fs/btrfs/send.c   |4 ++--
>  8 files changed, 31 insertions(+), 32 deletions(-)
>
> --
> 1.7.7.6
>

Something appears to be missing...


Re: Manual deduplication would be useful

2013-07-23 Thread cwillu
On Tue, Jul 23, 2013 at 9:47 AM, Rick van Rein  wrote:
> Hello,
>
> For over a year now, I've been experimenting with stacked filesystems as a 
> way to save on resources.  A basic OS layer is shared among Containers, each 
> of which stacks a layer with modifications on top of it.  This approach means 
> that Containers share buffer cache and loaded executables.  Concrete 
> technology choices aside, the result is rock-solid and the efficiency 
> improvements are incredible, as documented here:
>
> http://rickywiki.vanrein.org/doku.php?id=openvz-aufs
>
> One problem with this setup is updating software.  In lieu of 
> stacking-support in package managers, it is necessary to do this on a 
> per-Container basis, meaning that each installs their own versions, including 
> overwrites of the basic OS layer.  Deduplication could remedy this, but the 
> generic mechanism is known from ZFS to be fairly inefficient.
>
> Interestingly however, this particular use case demonstrates that a much 
> simpler deduplication mechanism than normally considered could be useful.  It 
> would suffice if the filesystem could check on manual hints, or 
> stack-specifying hints, to see if overlaid files share the same file 
> contents; when they do, deduplication could commence.  This saves searching 
> through the entire filesystem for every file or block written.  It might also 
> mean that the actual stacking is not needed, but instead a basic OS could be 
> cloned to form a new basic install, and kept around for this hint processing.
>
> I'm not sure if this should ideally be implemented inside the stacking 
> approach (where it would be stacking-implementation-specific) or in the 
> filesystem (for which it might be too far off the main purpose) but I thought 
> it wouldn't hurt to start a discussion on it, given that (1) filesystems 
> nowadays service multiple instances, (2) filesystems like Btrfs are based on 
> COW, and (3) deduplication is a goal but the generic mechanism could use some 
> efficiency improvements.
>
> I hope having seen this approach is useful to you!
>
> Please reply-all?  I'm not on this list.
>
> Cheers,
>  -Rick

There's patches providing offline dedup (i.e., manually telling the
kernel which files to consider) floating around:
http://lwn.net/Articles/547542/


Re: raid 10 corruption from single drive failure

2013-06-29 Thread cwillu
> Making this with all 6 devices from the beginning and btrfsck doesn't
> segfault. But it also doesn't repair the system enough to make it
> mountable. ( neither does -o recover, however -o degraded works, and
> files
> are then accessible )

Not sure I entirely follow: mounting with -o degraded (not -o
recovery) is how you're supposed to mount if there's a disk missing.


Re: My multi-device btrfs (3*2TB) won't mount anymore.

2013-06-18 Thread cwillu
Does anything show up in dmesg when you mount?

If mount just hangs, do an alt-sysrq-w, and then post what that sends to dmesg.


Re: raid0, raid1, raid5, what to choose?

2013-06-13 Thread cwillu
On Thu, Jun 13, 2013 at 3:21 PM, Hugo Mills  wrote:
> On Thu, Jun 13, 2013 at 11:09:00PM +0200, Hendrik Friedel wrote:
>> Hello,
>>
>> I'd appreciate your recommendation on this:
>>
>> I have three hdd with 3TB each. I intend to use them as raid5 eventually.
>> currently I use them like this:
>>
>> # mount|grep sd
>> /dev/sda1 on /mnt/Datenplatte type ext4
>> /dev/sdb1 on /mnt/BTRFS/Video type btrfs
>> /dev/sdb1 on /mnt/BTRFS/rsnapshot type btrfs
>>
>> #df -h
>> /dev/sda1   2,7T  1,3T  1,3T  51% /mnt/Datenplatte
>> /dev/sdb1   5,5T  5,4T   93G  99% /mnt/BTRFS/Video
>> /dev/sdb1   5,5T  5,4T   93G  99% /mnt/BTRFS/rsnapshot
>>
>> Now, what surprises me, and here I lack memory- is that sdb appears
>> twice.. I think, I created a raid1, but how can I find out?
>
>Appearing twice in that list is more an indication that you have
> multiple subvolumes -- check the subvol= options in /etc/fstab
>
>> #/usr/local/smarthome# ~/btrfs/btrfs-progs/btrfs fi show /dev/sdb1
>> Label: none  uuid: 989306aa-d291-4752-8477-0baf94f8c42f
>> Total devices 2 FS bytes used 2.68TB
>> devid2 size 2.73TB used 2.73TB path /dev/sdc1
>> devid1 size 2.73TB used 2.73TB path /dev/sdb1
>>
>> Now, I wanted to convert it to raid0, because I lack space and
>> redundancy is not important for the Videos and the Backup, but this
>> fails:
>> ~/btrfs/btrfs-progs/btrfs fi balance start -dconvert=raid0  /mnt/BTRFS/
>> ERROR: error during balancing '/mnt/BTRFS/' - Inappropriate ioctl for device
>
>/mnt/BTRFS isn't a btrfs subvol, according to what you have listed
> above. It's a subdirectory in /mnt which is contains two subdirs
> (Video and rsnapshot) which are used as mountpoints for subvolumes.
>
>Try running the above command with /mnt/BTRFS/Video instead (or
> rsnapshot -- it doesn't matter which).
>
>> dmesg does not help here.
>>
>> Anyway: This gave me some time to think about this. In fact, as soon
>> as raid5 is stable, I want to have all three as a raid5. Will this
>> be possible with a balance command? If so: will this be possible as
>> soon as raid5 is stable, or will I have to wait longer?
>
>Yes, it's possible to convert to RAID-5 right now -- although the
> code's not settled down into its final form quite yet. Note that
> RAID-5 over two devices won't give you any space benefits over RAID-1
> over two devices. (Or any reliability benefits either).
>
>Hugo.
>
> --
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>   PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>--- "Are you the man who rules the Universe?" "Well,  I ---
>   try not to."

RAID5 is currently only suitable for testing: it's known and expected
to break on power cuts, for instance.  The parity logging stuff is
waiting on the skip-list implementation you may have read about on
LWN; otherwise the performance overhead wasn't acceptable, or some
such.


Re: Recommended settings for SSD

2013-05-26 Thread cwillu
On Sun, May 26, 2013 at 9:16 AM, Harald Glatt  wrote:
> I don't know a better way to check than doing df -h before and
> after... If you use space_cache you have to clear_cache though to make
> the numbers be current for sure each time before looking at df.

Not sure what you're thinking of; space_cache is just a mount-time
optimization, storing an in-memory structure to disk so that it
doesn't have to be regenerated on every mount.

As I understand it, if it's ever wrong, it's a serious bug.


Re: Recommended settings for SSD

2013-05-24 Thread cwillu
> At the moment I am using:
> defaults,noatime,nodiratime,ssd,subvol=@home

No need to specify ssd, it's automatically detected.


Re: hard freezes with 3.9.0 during io-intensive loads

2013-05-05 Thread cwillu
On Sun, May 5, 2013 at 10:10 AM, Kai Krakow  wrote:
> Hello list,
>
> Kai Krakow  schrieb:
>
>> I've upgraded to 3.9.0 mainly for the snapshot-aware defragging patches.
>> I'm running bedup[1] on a regular basis and it is now the third time that
>> I got back to my PC just to find it hard-frozen and I needed to use the
>> reset button.
>>
>> It looks like this happens only while running bedup on my two btrfs
>> filesystems but I'm not sure if it happens for any of the filesystems or
>> only one. This is my setup:
>>
>> # cat /etc/fstab (shortened)
>> UUID=d2bb232a-2e8f-4951-8bcc-97e237f1b536 / btrfs
>> compress=lzo,subvol=root64 0 1 # /dev/sd{a,b,c}3
>> LABEL=usb-backup /mnt/private/usb-backup btrfs noauto,compress-
>> force=zlib,subvolid=0,autodefrag,comment=systemd.automount 0 0 # external
>> usb3 disk
>>
>> # btrfs filesystem show
>> Label: 'usb-backup'  uuid: 7038c8fa-4293-49e9-b493-a9c46e5663ca
>> Total devices 1 FS bytes used 1.13TB
>> devid1 size 1.82TB used 1.75TB path /dev/sdd1
>>
>> Label: 'system'  uuid: d2bb232a-2e8f-4951-8bcc-97e237f1b536
>> Total devices 3 FS bytes used 914.43GB
>> devid3 size 927.26GB used 426.03GB path /dev/sdc3
>> devid2 size 927.26GB used 426.03GB path /dev/sdb3
>> devid1 size 927.26GB used 427.07GB path /dev/sda3
>>
>> Btrfs v0.20-rc1
>>
>> Since the system hard-freezes I have no messages from dmesg. But I suspect
>> it to be related to the defragmentation option in bedup (I've switched to
>> bedub with --defrag since 3.9.0, and autodefrag for the backup drive).
>> Just in case, I'm going to try without this option now and see if it won't
>> freeze.
>>
>> I was able to take a "physical" screenshot with a real camera of a kernel
>> backtrace one time when the freeze happened. I wonder if it is useful to
>> you and where to send it. I just don't want to upload jpegs right here to
>> the list without asking first.
>>
>> The big plus is: Although I had to hard-reset the frozen system several
>> times now, btrfs survived the procedure without any impact (just boot
>> times increase noticeably, probably due to log-replays or something). So
>> thumbs up for the developers on that point.
>
> Thanks to the great cwillu netcat service here's my backtrace:
>
> 4,1072,17508258745,-;[ cut here ]
> 2,1073,17508258772,-;kernel BUG at fs/btrfs/ctree.c:1144!
> 4,1074,17508258791,-;invalid opcode:  [#1] SMP
> 4,1075,17508258811,-;Modules linked in: bnep bluetooth af_packet vmci(O)
> vmmon(O) vmblock(O) vmnet(O) vsock reiserfs snd_usb_audio snd_usbmidi_lib
> snd_rawmidi snd_seq_device gspca_sonixj gpio_ich gspca_main videodev
> coretemp hwmon kvm_intel kvm crc32_pclmul crc32c_intel 8250 serial_core
> lpc_ich microcode mfd_core i2c_i801 pcspkr evdev usb_storage zram(C) unix
> 4,1076,17508258966,-;CPU 0
> 4,1077,17508258977,-;Pid: 7212, comm: btrfs-endio-wri Tainted: G C O
> 3.9.0-gentoo #2 To Be Filled By O.E.M. To Be Filled By O.E.M./Z68 Pro3
> 4,1078,17508259023,-;RIP: 0010:[]  []
> __tree_mod_log_rewind+0x4c/0x121
> 4,1079,17508259064,-;RSP: 0018:8801966718e8  EFLAGS: 00010293
> 4,1080,17508259085,-;RAX: 0003 RBX: 8801ee8d33b0 RCX:
> 880196671888
> 4,1081,17508259112,-;RDX: 0a4596a4 RSI: 0eee RDI:
> 8804087be700
> 4,1082,17508259138,-;RBP: 0071 R08: 1000 R09:
> 880196671898
> 4,1083,17508259165,-;R10:  R11:  R12:
> 880406c2e000
> 4,1084,17508259191,-;R13: 8a11 R14: 8803b5aa1200 R15:
> 0001
> 4,1085,17508259218,-;FS:  () GS:88041f20()
> knlGS:
> 4,1086,17508259248,-;CS:  0010 DS:  ES:  CR0: 80050033
> 4,1087,17508259270,-;CR2: 026f0390 CR3: 01a0b000 CR4:
> 000407f0
> 4,1088,17508259297,-;DR0:  DR1:  DR2:
> 
> 4,1089,17508259323,-;DR3:  DR6: 0ff0 DR7:
> 0400
> 4,1090,17508259350,-;Process btrfs-endio-wri (pid: 7212, threadinfo
> 88019667, task 8801b82e5400)
> 4,1091,17508259383,-;Stack:
> 4,1092,17508259391,-; 8801ee8d38f0 880021b6f360 88013a5b2000
> 8a11
> 4,1093,17508259423,-; 8802d0a14000 81167606 0246
> 8801ee8d33b0
> 4,1094,17508259455,-; 880406c2e000 8801966719bf 880021b6f360
> 
> 4,1095,17508259

Re: Panic while running defrag

2013-04-29 Thread cwillu
On Mon, Apr 29, 2013 at 9:20 PM, Stephen Weinberg  wrote:
> I ran into a panic while running find -xdev | xargs btrfs fi defrag '{}'. I
> don't remember the exact command because the history was not saved. I also
> started and stopped it a few times however.
>
> The kernel logs were on a different filesystem. Here is the
> kern.log:http://fpaste.org/9383/36729191/

Apr 28 15:24:05 hotel kernel: [614592.785065] [ cut here
]
Apr 28 15:24:05 hotel kernel: [614592.785146] WARNING: at
/build/buildd-linux_3.8.5-1~experimental.1-amd64-_t_ZfP/linux-3.8.5/fs/btrfs/locking.c:46
btrfs_set_lock_blocking_rw+0x6f/0xe7 [btrfs]()
Apr 28 15:24:05 hotel kernel: [614592.785152] Hardware name:
BK169AAR-ABA HPE-210f
Apr 28 15:24:05 hotel kernel: [614592.785157] Modules linked in:
nls_utf8 nls_cp437 vfat fat cbc ecb parport_pc ppdev lp parport bnep
rfcomm bluetooth binfmt_misc nfsd auth_rpcgss nfs_acl nfs lockd
dns_resolver fscache sunrpc loop ecryptfs arc4 snd_hda_codec_hdmi
snd_hda_codec_realtek ath9k ath9k_common ath9k_hw ath mac80211
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm acpi_cpufreq
snd_page_alloc radeon mperf ttm snd_seq cfg80211 kvm_amd kvm
sp5100_tco snd_seq_device snd_timer rfkill drm_kms_helper drm snd
edac_mce_amd edac_core k10temp i2c_piix4 i2c_algo_bit soundcore
microcode i2c_core button processor psmouse evdev serio_raw
thermal_sys pcspkr ext4 crc16 jbd2 mbcache btrfs zlib_deflate crc32c
libcrc32c hid_generic usbhid hid usb_storage sg sr_mod cdrom sd_mod
crc_t10dif firewire_ohci firewire_core crc_itu_t ehci_pci ohci_hcd
ehci_hcd ahci libahci libata usbcore r8169 mii scsi_mod usb_common
Apr 28 15:24:05 hotel kernel: [614592.785279] Pid: 24757, comm: btrfs
Not tainted 3.8-trunk-amd64 #1 Debian 3.8.5-1~experimental.1
Apr 28 15:24:05 hotel kernel: [614592.785284] Call Trace:
Apr 28 15:24:05 hotel kernel: [614592.785299]  [] ?
warn_slowpath_common+0x76/0x8a
Apr 28 15:24:05 hotel kernel: [614592.785350]  [] ?
btrfs_set_lock_blocking_rw+0x6f/0xe7 [btrfs]
Apr 28 15:24:05 hotel kernel: [614592.785386]  [] ?
btrfs_realloc_node+0xef/0x380 [btrfs]
Apr 28 15:24:05 hotel kernel: [614592.785434]  [] ?
btrfs_defrag_leaves+0x242/0x304 [btrfs]
Apr 28 15:24:05 hotel kernel: [614592.785479]  [] ?
btrfs_defrag_root+0x4f/0x9e [btrfs]
Apr 28 15:24:05 hotel kernel: [614592.785526]  [] ?
btrfs_ioctl_defrag+0xb2/0x194 [btrfs]
Apr 28 15:24:05 hotel kernel: [614592.785574]  [] ?
btrfs_ioctl+0x771/0x175a [btrfs]
Apr 28 15:24:05 hotel kernel: [614592.785584]  [] ?
handle_mm_fault+0x1eb/0x239
Apr 28 15:24:05 hotel kernel: [614592.785594]  [] ?
__do_page_fault+0x2d7/0x375
Apr 28 15:24:05 hotel kernel: [614592.785605]  [] ?
vfs_ioctl+0x1e/0x31
Apr 28 15:24:05 hotel kernel: [614592.785613]  [] ?
do_vfs_ioctl+0x3ee/0x430
Apr 28 15:24:05 hotel kernel: [614592.785624]  [] ?
kmem_cache_free+0x44/0x80
Apr 28 15:24:05 hotel kernel: [614592.785632]  [] ?
sys_ioctl+0x4d/0x7c
Apr 28 15:24:05 hotel kernel: [614592.785642]  [] ?
system_call_fastpath+0x16/0x1b
Apr 28 15:24:05 hotel kernel: [614592.785648] ---[ end trace
2150df5c163b6833 ]---
Apr 28 15:24:05 hotel kernel: [614592.785693] [ cut here
]
Apr 28 15:24:05 hotel kernel: [614592.785813] kernel BUG at
/build/buildd-linux_3.8.5-1~experimental.1-amd64-_t_ZfP/linux-3.8.5/fs/btrfs/locking.c:265!
Apr 28 15:24:05 hotel kernel: [614592.786054] invalid opcode:  [#1] SMP
Apr 28 15:24:05 hotel kernel: [614592.786158] Modules linked in:
nls_utf8 nls_cp437 vfat fat cbc ecb parport_pc ppdev lp parport bnep
rfcomm bluetooth binfmt_misc nfsd auth_rpcgss nfs_acl nfs lockd
dns_resolver fscache sunrpc loop ecryptfs arc4 snd_hda_codec_hdmi
snd_hda_codec_realtek ath9k ath9k_common ath9k_hw ath mac80211
snd_hda_intel snd_hda_codec snd_hwdep snd_pcm acpi_cpufreq
snd_page_alloc radeon mperf ttm snd_seq cfg80211 kvm_amd kvm
sp5100_tco snd_seq_device snd_timer rfkill drm_kms_helper drm snd
edac_mce_amd edac_core k10temp i2c_piix4 i2c_algo_bit soundcore
microcode i2c_core button processor psmouse evdev serio_raw
thermal_sys pcspkr ext4 crc16 jbd2 mbcache btrfs zlib_deflate crc32c
libcrc32c hid_generic usbhid hid usb_storage sg sr_mod cdrom sd_mod
crc_t10dif firewire_ohci firewire_core crc_itu_t ehci_pci ohci_hcd
ehci_hcd ahci libahci libata usbcore r8169 mii scsi_mod usb_common
Apr 28 15:24:05 hotel kernel: [614592.788282] CPU 2
Apr 28 15:24:05 hotel kernel: [614592.788337] Pid: 24757, comm: btrfs
Tainted: GW3.8-trunk-amd64 #1 Debian
3.8.5-1~experimental.1 HP-Pavilion BK169AAR-ABA HPE-210f/ALOE
Apr 28 15:24:05 hotel kernel: [614592.788634] RIP:
0010:[]  []
btrfs_assert_tree_locked+0x7/0xa [btrfs]
Apr 28 15:24:05 hotel kernel: [614592.788899] RSP:
0018:88021b017c20  EFLAGS: 00010246
Apr 28 15:24:05 hotel kernel: [614592.789023] RAX: 
RBX: 88003396f310 RCX: 05ad05ad
Apr 28 15:24:05 hotel kernel: [614592.789186] RDX: fa56
RSI: 0046 RDI: 88003396f310
Apr 28 15:24:05 hotel kernel: [614592.789349] RBP: 0

Re: Btrfs performance problem; metadata size to blame?

2013-04-28 Thread cwillu
[how'd that send button get there]

space_cache is the default, set by mkfs, for a year or so now.  It's
sticky, so even if it wasn't, you'd only need to mount with it once.


Re: Btrfs performance problem; metadata size to blame?

2013-04-28 Thread cwillu
On Sun, Apr 28, 2013 at 2:17 PM, Roger Binns  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 28/04/13 12:57, Harald Glatt wrote:
>> If you want better answers ...
>
> There is a lot of good information at the wiki and it does see regular
> updates.  For example the performance mount options are on this page:
>
>   https://btrfs.wiki.kernel.org/index.php/Mount_options
>
> Roger
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.12 (GNU/Linux)
>
> iEYEARECAAYFAlF9g+wACgkQmOOfHg372QQu6QCffq/cB7GPutTwiAUE0CyTuIJx
> Qj8AnjsqxVyPrK5FTDqaLk1d1lsYYB38
> =6HN3
> -END PGP SIGNATURE-
>


Re: scrub "correcting" tons of errors ?

2013-03-29 Thread cwillu
> Actually instead of netconsole we have an awesome service provided by
> Carey, you can just do
>
> nc cwillu.com 10101 < /dev/kmsg

... at a root prompt.

> after you've run sysrq+w and then reply with the URL it spits out.  Thanks,


Re: minimum kernel version for btrfsprogs.0.20?

2013-03-28 Thread cwillu
On Thu, Mar 28, 2013 at 11:41 PM, Chris Murphy  wrote:
> Creating a btrfs file system using 
> btrfs-progs-0.20.rc1.20130308git704a08c-1.fc19, and either kernel 
> 3.6.10-4.fc18 or 3.9.0-0.rc3.git0.3.fc19, makes a file system that cannot be 
> mounted by kernel 3.6.10-4.fc18. It can be mounted by kernel 3.8.4. I haven't 
> tested any other 3.8, or any 3.7 kernels.
>
> Is this expected?
>
> dmesg reports:
> [  300.014764] btrfs: disk space caching is enabled
> [  300.024137] BTRFS: couldn't mount because of unsupported optional features 
> (40).
> [  300.034148] btrfs: open_ctree failed

commit 1a72afaa "btrfs-progs: mkfs support for extended inode refs"
unconditionally enables extended irefs (which permits more than 4k
links to the same inode).  It's the right default imo, but there
probably should have been a mkfs option to disable it.
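As a refresher on what the 4k figure counts (a sketch with scratch paths, nothing btrfs-specific): every hard link to a file raises the inode's link count, and that count is what extended irefs lift the ceiling on.

```shell
#!/bin/sh
# Sketch: each hard link bumps the inode's link count, which is the
# number the old ~4k per-inode limit (without extended irefs) applied to.
set -e
dir=$(mktemp -d)
touch "$dir/file"
ln "$dir/file" "$dir/link1"
ln "$dir/file" "$dir/link2"
# All three names share one inode; the link count is now 3.
stat -c %h "$dir/file"
rm -r "$dir"
```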


Re: question about replacing a drive in raid10

2013-03-28 Thread cwillu
On Thu, Mar 28, 2013 at 1:54 AM, Joeri Vanthienen
 wrote:
> Hi all,
>
> I have a question about replacing a drive in raid10 (and linux kernel 3.8.4).
> A bad disk was physical removed from the server. After this a new disk
> was added with "btrfs device add /dev/sdg /btrfs" to the raid10 btrfs
> FS.
> After this the server was rebooted and I mounted the filesystem in
> degraded mode. It seems that a previous started balance continued.
>
> At this point I want to remove the missing device from the pooI (btrfs
> device delete missing /btrfs). Is this safe to do ?

Yep.

> The disk usage numbers look weird to me, also the limited amount of
> data written to the new disk after the balance.

You're not actually looking at the data on the disk, but the size of
the block groups allocated on that disk.  I expect the data got spread
across all of the remaining disks, including the new one.

Probably worth running a scrub anyway though.


Re: No space left on device (28)

2013-03-21 Thread cwillu
On Fri, Mar 22, 2013 at 12:39 AM, Stefan Priebe - Profihost AG
 wrote:
> Already tried with value 5 did not help ;-( and it also happens with plain cp 
> copying a 15gb file and aborts at about 80%

You tried -musage=5?  Your original email said -dusage=5.


Re: No space left on device (28)

2013-03-21 Thread cwillu
On Fri, Mar 22, 2013 at 12:13 AM, Roman Mamedov  wrote:
> On Thu, 21 Mar 2013 20:42:28 +0100
> Stefan Priebe  wrote:
>
> I might be wrong here, but doesn't this
>
>> rsync: rename
>> "/mnt/.software/kernel/linux-3.9-rc3/drivers/infiniband/hw/amso1100/""
>> ->
>> ".software/kernel/linux-3.9-rc3/drivers/infiniband/hw/amso1100/c2_ae.h":
>
> ...try to move a file from
>
>   "/mnt/.software/"
>
> to
>
>   ".software/"
>
> (relative to current dir)??

No; that's rsync giving the full path, and then the target path
relative to the command it was given.  The filename itself
(".c2_ae.h.WEhLGP") is a semi-random filename rsync uses to write to
temporarily, so it can mv it over the original in an atomic fashion...

Stefan: ...which means that the actual copy succeeded, which suggests
that this is more of a metadata enospc thing.

You might try btrfs balance start -musage=5 (instead of -dusage), and
if that doesn't report any chunks balanced, try a high number until it
does.
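The write-to-temp-then-rename trick rsync uses (which explains the odd ".c2_ae.h.WEhLGP"-style filename) can be sketched in plain shell; the paths here are illustrative scratch files:

```shell
#!/bin/sh
# Sketch of rsync's strategy: write the new contents to a randomly named
# temp file in the same directory, then rename over the target.
# rename(2) within one filesystem is atomic, so readers see either the
# old file or the new one, never a partial copy.
set -e
dir=$(mktemp -d)
echo "old contents" > "$dir/target"
tmp=$(mktemp "$dir/.target.XXXXXX")   # rsync-style hidden temp name
echo "new contents" > "$tmp"
mv "$tmp" "$dir/target"               # atomic replace on the same fs
cat "$dir/target"
rm -r "$dir"
```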


Re: How to recover uncorrectable errors ?

2013-03-20 Thread cwillu
>> # rm -rf *
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> rm: cannot remove 'drivers/misc/lis3lv02d/lis3lv02d.c': Stale NFS file handle
>> ...
>
> You are trying to remove the files from an NFS client. Stale NFS file
> handle just means that the NFS handle is no longer valid. NFS clients
> refer to files by a file handle composed of filesystem id and
> inode number. Maybe a change in there?
>
> Anyway, to find the real error message its necessary to try to delete
> the files on the server. Cause even if there is a real BTRFS issue, the
> NFS client likely won´t report helpful error messages.

Don't read too much into that "Stale NFS file handle" message; ESTALE
doesn't imply anything about NFS being involved, despite the standard
error string for that value.
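The point about the error string is easy to check directly: ESTALE is an ordinary errno (116 on Linux), and the "NFS" wording is just the text the C library attaches to it. A quick sketch (newer glibc versions dropped the "NFS" from the message, so the exact string varies):

```shell
#!/bin/sh
# ESTALE is a plain errno value; btrfs can return it with no NFS
# involved. Depending on libc version this prints "Stale file handle"
# or the older "Stale NFS file handle".
python3 -c 'import errno, os; print(errno.ESTALE, os.strerror(errno.ESTALE))'
```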


Re: Btrfs in multiple different disc -d sigle -m raid1 one drive failure...

2013-03-18 Thread cwillu
On Mon, Mar 18, 2013 at 12:32 PM, Jan Beranek  wrote:
> Hi all,
> I'm preparing a storage pool for large data with quite low importance
> - there will be at least 3 hdd in "-d single" and "-m raid1"
> configuration.
>
> mkfs.btrfs -d single -m raid1 /dev/sda /dev/sdb /dev/sdc
>
> What happens if one hdd fails? Do I lose everything from all three
> discs or only data from one disc? (if from only one disc, then is it
> acceptable otherwise not...)

I just finished doing some testing to check:  It will work, kinda sorta.

You'll be forced to mount read-only, and any reads of file extents
that existed on the missing disk will return an io error.  As I
understand it, single doesn't force files onto a single disk so much
as it doesn't stripe them across _several_ disks; the implication
being that a large file (say, a 4gb movie) may still end up with
pieces on each disk.


Re: multiple btrfsck runs

2013-03-16 Thread cwillu
On Sat, Mar 16, 2013 at 6:44 AM, Marc MERLIN  wrote:
> On Sat, Mar 16, 2013 at 06:24:47AM -0600, cwillu wrote:
>> On Sat, Mar 16, 2013 at 6:02 AM, Russell Coker  wrote:
>> > Is it expected that running btrfsck more than once will keep reporting 
>> > errors?
>>
>> Without options, btrfsck does not write to the disk.
>
> Ah, that explains why I never got it to work the day I wanted to try
> it.
> I should note that what you're saying is neither documented in the man
> page, nor in https://btrfs.wiki.kernel.org/index.php/Btrfsck
>
> For that matter, the wiki actually states there are no options.
>
> Is that mostly intentional so that whoever isn't reading the source
> doesn't really run the tool because it's not ready?

At least a bit, yeah.  People have been getting better about updating
the documentation recently though.


Re: multiple btrfsck runs

2013-03-16 Thread cwillu
On Sat, Mar 16, 2013 at 6:46 AM, Russell Coker  wrote:
> On Sat, 16 Mar 2013, cwillu  wrote:
>> On Sat, Mar 16, 2013 at 6:02 AM, Russell Coker  wrote:
>> > Is it expected that running btrfsck more than once will keep reporting
>> > errors?
>>
>> Without options, btrfsck does not write to the disk.
>
> The man page for the version in Debian doesn't document any options.
>
> The source indicates that --repair might be the one that's desired, is that
> correct?

Yes.  However, unless something is actually broken, or you've been
advised by a developer, I'd stick with btrfs scrub.


Re: multiple btrfsck runs

2013-03-16 Thread cwillu
On Sat, Mar 16, 2013 at 6:02 AM, Russell Coker  wrote:
> Is it expected that running btrfsck more than once will keep reporting errors?

Without options, btrfsck does not write to the disk.


Re: Drive low space / huge performance hit.

2013-03-07 Thread cwillu
On Thu, Mar 7, 2013 at 11:05 AM, Steve Heyns  wrote:
> hi
>
> I am using compression lzo on my 350GB partition, I have 2 subvolumes
> on this partition. My kernel is 3.7 BTRFS v0.19 -
>
> According to my system (df -h) that partition has 75Gb available.
> According to btrfs
>
> btrfs fi df /mnt/DevSystem/
> Data: total=260.01GB, used=259.09GB
> System, DUP: total=8.00MB, used=36.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=8.00GB, used=7.87GB
> Metadata: total=8.00MB, used=0.00

Please show the output of btrfs fi show /dev/whatever


Re: Does defragmenting even work

2013-02-28 Thread cwillu
On Thu, Feb 28, 2013 at 8:35 AM, Swâmi Petaramesh  wrote:
> BTW...
>
> I'm not even sure that "btrfs filesystem defrag " actually
> does anything...
>
> If I run "filefrag " afterwards, it typically shows the same
> number of fragments that it did prior to running defrag...
>
> I'm not sure about how it actually works and what I should expect...

Can't explain something if I can't see the data I'm explaining :p


Re: copy on write misconception

2013-02-22 Thread cwillu
On Fri, Feb 22, 2013 at 11:41 AM, Mike Power  wrote:
> On 02/22/2013 09:16 AM, Hugo Mills wrote:
>>
>> On Fri, Feb 22, 2013 at 09:11:28AM -0800, Mike Power wrote:
>>>
>>> I think I have a misconception of what copy on write in btrfs means
>>> for individual files.
>>>
>>> I had originally thought that I could create a large file:
>>> time dd if=/dev/zero of=10G bs=1G count=10
>>> 10+0 records in
>>> 10+0 records out
>>> 10737418240 bytes (11 GB) copied, 100.071 s, 107 MB/s
>>>
>>> real1m41.082s
>>> user0m0.000s
>>> sys0m7.792s
>>>
>>> Then if I copied this file no blocks would be copied until they are
>>> written.  Hence the two files would use the same blocks underneath.
>>> But specifically that copy would be fast.  Since it would only need
>>> to write some metadata.  But when I copy the file:
>>> time cp 10G 10G2
>>>
>>> real3m38.790s
>>> user0m0.124s
>>> sys0m10.709s
>>>
>>> Oddly enough it actually takes longer then the initial file
>>> creation.  So I am guessing that the long duration copy of the file
>>> is expected and that is not one of the virtues of btrfs copy on
>>> write.  Does that sound right?
>>
>> You probably want cp --reflink=always, which makes a CoW copy of
>> the file's metadata only. The resulting files have the semantics of
>> two different files, but share their blocks until a part of one of
>> them is modified (at which point, the modified blocks are no longer
>> shared).
>>
>> Hugo.
>>
> I see, and it works great:
> time cp --reflink=always 10G 10G3
>
> real0m0.028s
> user0m0.000s
> sys0m0.000s
>
> So from the user perspective I might say I want to opt out of this feature
> not optin.  I want all copies by all applications done as a copy on write.
> But if my understanding is correct that is up to the application being
> called (in this case cp) and how it in turns makes calls to the system.
>
> In short I can't remount the btrfs filesystem with some new args that says
> always copy on write files because that is what it already.

There's no "copy a file" syscall; when a program copies a file, it
opens a new file, and writes all the bytes from the old to the new.
Converting this to a reflink would require btrfs to implement full
de-dup (which is rather expensive), and still wouldn't prevent the
program from reading and writing all 10gb (and so wouldn't be any
faster).

You can set an alias in your shell to make cp --reflink=auto the
default, but that won't affect other programs, nor other users.
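A hedged illustration of why the alias is safe: --reflink=auto asks for a CoW clone but quietly falls back to an ordinary byte-for-byte copy on filesystems without reflink support, so the result is correct either way (scratch paths below are illustrative):

```shell
#!/bin/sh
# --reflink=auto: reflink when the filesystem supports it (btrfs, and
# later xfs), otherwise fall back to a normal copy. Either way the two
# files end up with identical, independent contents.
set -e
dir=$(mktemp -d)
dd if=/dev/zero of="$dir/orig" bs=1024 count=16 2>/dev/null
cp --reflink=auto "$dir/orig" "$dir/copy"
cmp "$dir/orig" "$dir/copy" && echo "contents identical"
rm -r "$dir"
```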


Re: copy on write misconception

2013-02-22 Thread cwillu
> Then if I copied this file no blocks would be copied until they are written.
> Hence the two files would use the same blocks underneath. But specifically
> that copy would be fast.  Since it would only need to write some metadata.
> But when I copy the file:
> time cp 10G 10G2

cp without arguments still does a regular copy; btrfs does nothing to
de-duplicate writes.

"cp --reflink 10G 10G2" will give you the results you expect.


Re: Fwd: Current State of BTRFS

2013-02-08 Thread cwillu
On Fri, Feb 8, 2013 at 4:56 PM, Florian Hofmann
 wrote:
> Oh ... I should have mentioned that btrfs is running on top of LUKS.
>
> 2013/2/8 Florian Hofmann :
>> $ btrfs fi df /
>> Data: total=165.00GB, used=164.19GB
>> System, DUP: total=32.00MB, used=28.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, DUP: total=2.00GB, used=1.40GB
>>
>> $ btrfs fi show
>> failed to read /dev/sr0
>> Label: none  uuid: b4ec0b14-2a42-47e3-a0cd-1257e789ed25
>> Total devices 1 FS bytes used 165.59GB
>> devid1 size 600.35GB used 169.07GB path /dev/dm-0
>>
>> Btrfs Btrfs v0.19
>>
>> ---
>>
>> I just noticed that I can force 'it' by transferring a large file from
>> my NAS. I did the sysrq-trigger thing, but there is no suspicious
>> output in dmesg (http://pastebin.com/swrCdC3U).
>>
>> Anything else?

The pastebin didn't include any output from sysrq-w; even if there's
nothing to report, there would still be a dozen lines or so per cpu; at
the absolute minimum, there should be a line for each time you ran it:

[4477369.680307] SysRq : Show Blocked State

Note that you need to echo as root, or use the keyboard combo
alt-sysrq-w to trigger.


Re: System unmountable RW

2013-02-01 Thread cwillu
> then I do : mount -o rw,remount /backup/
>
> Feb  1 22:32:38 frozen kernel: [   65.780686] btrfs: force zlib compression
> Feb  1 22:32:38 frozen kernel: [   65.780700] btrfs: not using ssd allocation 
> scheme
> Feb  1 22:32:38 frozen kernel: [   65.780706] btrfs: disk space caching is 
> enabled
>
>
> I let that mount run days, without any success. It stay running, and I can't 
> interrupt it (CTRL+C or kill).

Hit alt-sysrq-w at that point, and then post your dmesg; there should
be at least one stacktrace in there (possibly many), which should give
a good idea where it's hanging up.


Re: [RFC] Abort on memory allocation failure

2013-01-25 Thread cwillu
On Fri, Jan 25, 2013 at 4:55 PM, Ian Kumlien  wrote:
> Hi,
>
> Could someone do a sanity check of this, i have removed some of the
> checking code that is no longer needed but i would prefer to have
> reviewers. I haven't looked much at the code, mainly been focusing on
> the grunt work ;)
>
> Anyway, thanks for looking at it!

Include patches inline in your email rather than as an attachment.


Re: [PATCH] Btrfs-progs: Exit if not running as root

2013-01-25 Thread cwillu
On Fri, Jan 25, 2013 at 9:04 AM, Gene Czarcinski  wrote:
> OK, I think I have gotten the message that this is a bad idea as implemented
> and that it should be dropped as such.  I believe that there are some things
> ("btrfs fi show" comes to mind) which will need root and I am going to
> explore doing something for that case.  And it also might be reasonable for
> some situations to issue the message about root if something errors-out.

Eh?  That's one of the clearest cases where you _may not_ need root.

cwillu@cwillu-home:~$ groups
cwillu adm dialout cdrom audio video plugdev mlocate lpadmin admin sambashare
cwillu@cwillu-home:~$ btrfs fi show /dev/sda3
failed to read /dev/sda
failed to read /dev/sda1
failed to read /dev/sda2
failed to read /dev/sda3
failed to read /dev/sdb
Btrfs v0.19-152-g1957076

cwillu@cwillu-home:~$ sudo addgroup cwillu disk
cwillu@cwillu-home:~$ su cwillu

cwillu@cwillu-home:~$ groups
cwillu adm disk dialout cdrom audio video plugdev mlocate lpadmin
admin sambashare
cwillu@cwillu-home:~$ btrfs fi show /dev/sda3
Label: none  uuid: ede59711-6230-474f-992d-f1e3deeddab7
Total devices 1 FS bytes used 72.12GB
devid1 size 104.34GB used 104.34GB path /dev/sda3

Btrfs v0.19-152-g1957076


Re: scrub questtion

2013-01-15 Thread cwillu
On Tue, Jan 15, 2013 at 8:21 AM, Gene Czarcinski  wrote:
> When you start btrfs scrub and point at one subvolume, what is "scrubbed"?
>
> Just that subvolume or the entire volume?

The entire volume.


Re: obscure out of space, df and fi df are way off

2013-01-11 Thread cwillu
>>> [root@localhost tmp]# df
>>> Filesystem 1K-blocksUsed Available Use% Mounted on
>>> /dev/sda33746816 3193172  1564 100% /mnt/sysimage
>>> /dev/sda1 495844   31509438735   7% 
>>> /mnt/sysimage/boot
>>> /dev/sda33746816 3193172  1564 100% 
>>> /mnt/sysimage/home
>>>
>>> So there's 1.5M of free space left according to conventional df. However:
>>>
>>> [root@localhost tmp]# btrfs fi show
>>> Label: 'fedora_f18v'  uuid: 0c9b2b62-5ec1-4610-ab2f-9f00c909428a
>>>Total devices 1 FS bytes used 2.87GB
>>>devid1 size 3.57GB used 3.57GB path /dev/sda3
>>>
>>> [root@localhost tmp]# btrfs fi df /mnt/sysimage
>>> Data: total=2.69GB, used=2.69GB
>>> System, DUP: total=8.00MB, used=4.00KB
>>> System: total=4.00MB, used=0.00
>>> Metadata, DUP: total=438.94MB, used=183.36MB
>>> Metadata: total=8.00MB, used=0.00

> So if I assume 2.7GiB for data, and add up the left side of fi df I get 
> 3224MB rounded up, which is neither 3.57GB nor 3.57GiB. I'm missing 346MB at 
> least. That is what I should have said from the outset.

2.69 + (438.94 / 1000 *2) + (8.0 / 1000 / 1000 *2) + (4.0 / 1000 /
1000) + (8.0 / 1000 / 1000 *2)
3.567916

Looks like 3.57GB to me :p
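The same sum can be checked mechanically; this reproduces the figures exactly as posted above (the *2 factors are the DUP groups counted twice, and the divisors are as given in the original message):

```shell
#!/bin/sh
# Re-adding the btrfs fi df numbers quoted in the thread:
# data + 2x DUP metadata + 2x DUP system + single system + 2x the
# single-metadata stub, all expressed in GB.
LC_ALL=C awk 'BEGIN {
  printf "%.6f\n", 2.69 + 438.94/1000*2 + 8.0/1000/1000*2 + 4.0/1000/1000 + 8.0/1000/1000*2
}'
```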

> So is the Metadata DUP Total 438.94MB allocated value actually twice that, but 
> only 438.94MB is displayed because that's what's available (since the 
> metadata is duplicated)?

The capacity of the metadata group is 438.94MB; the actual size on disk
is twice that.

>> Note that the -M option to mkfs.btrfs is intended for this use-case:
>> filesystems where the size of a block allocation is large compared to
>> the size of the filesystem.  It should let you squeeze out most of the
>> rest of that 400MB (200MB, DUP).
>
> Is there a simple rule of thumb an installer could use to know when to use 
> -M? I know mkfs.btrfs will do this for smaller filesystems than this. I'm 
> thinking this is a testing edge case that a desktop installer shouldn't be 
> concerned about, but rather should just gracefully fail from, or better yet, 
> insist on a larger install destination than this, in particular when using Btrfs.

I tend to go with "any filesystem smaller than 32GB", but a more
accurate rule is probably along the lines of "any filesystem that you
expect to normally run within half a gb of full".


Re: obscure out of space, df and fi df are way off

2013-01-11 Thread cwillu
On Fri, Jan 11, 2013 at 11:50 PM, Chris Murphy  wrote:
> Very low priority.
> No user data at risk.
> 8GB virtual disk being installed to, and the installer is puking. I'm trying 
> to figure out why.
>
> I first get an rsync error 12, followed by the installer crashing. What's 
> interesting is this, deleting irrelevant source file systems, just showing 
> the mounts for the installed system:
>
> [root@localhost tmp]# df
> Filesystem 1K-blocksUsed Available Use% Mounted on
> /dev/sda33746816 3193172  1564 100% /mnt/sysimage
> /dev/sda1 495844   31509438735   7% /mnt/sysimage/boot
> /dev/sda33746816 3193172  1564 100% /mnt/sysimage/home
>
> So there's 1.5M of free space left according to conventional df. However:
>
> [root@localhost tmp]# btrfs fi show
> Label: 'fedora_f18v'  uuid: 0c9b2b62-5ec1-4610-ab2f-9f00c909428a
> Total devices 1 FS bytes used 2.87GB
> devid1 size 3.57GB used 3.57GB path /dev/sda3
>
> [root@localhost tmp]# btrfs fi df /mnt/sysimage
> Data: total=2.69GB, used=2.69GB
> System, DUP: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=438.94MB, used=183.36MB
> Metadata: total=8.00MB, used=0.00
>
> And absolutely nothing in dmesg.
>
> This is confusing. fi show says 3.57GB available and used. Whereas fi df says 
> 2.69 available and used. So is it 3.57GB? Or is it 2.69? I suppose the simple 
> answer is, it doesn't matter, in either case it's full. But it seems like the 
> installer is underestimating Btrfs requirements and should be more 
> conservative, somehow so I'd like to better understand the allocation.

The reporting is correct, but a bit obscure.  There's a FAQ item on
the wiki about how to read the output, but it's a known sore spot.

btrfs fi show reports 3.57GB allocated to block groups (so everything
is assigned to metadata or data); btrfs fi df reports how that 3.57GB
is being used: of 2.69GB allocated to data block groups, 2.69GB (i.e.,
all of it) is in use by file data; of 438.94MB of metadata (or 0.87GB
after DUP), 183.36MB is in use by metadata (which may include small
files that have been inlined).

In other words, the tools are saying that filesystem is basically full. :)

Note that the -M option to mkfs.btrfs is intended for this use-case:
filesystems where the size of a block allocation is large compared to
the size of the filesystem.  It should let you squeeze out most of the
rest of that 400MB (200MB, DUP).
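To make the arithmetic concrete, here is a rough sketch reconciling the two outputs, using the figures quoted above (DUP block groups occupy twice their nominal size on the device):

```python
# Reconcile `btrfs fi show` (space allocated to block groups) with
# `btrfs fi df` (how those block groups are used), using the numbers
# quoted above.  DUP profiles keep two copies, so they consume twice
# their nominal size on the device.

GB = 1024  # work in MB for convenience

allocations = {
    # name: (nominal size in MB, number of copies on disk)
    "data":          (2.69 * GB, 1),
    "system_dup":    (8.00,      2),
    "system_single": (4.00,      1),
    "meta_dup":      (438.94,    2),
    "meta_single":   (8.00,      1),
}

total_mb = sum(size * copies for size, copies in allocations.values())
print("allocated: %.2f GB" % (total_mb / GB))  # -> allocated: 3.57 GB
```

The sum lands on the same 3.57GB that `fi show` reports as "used" on the device, which is why the two outputs only look contradictory.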


Re: Option LABEL

2013-01-03 Thread cwillu
On Thu, Jan 3, 2013 at 11:57 AM, Helmut Hullen  wrote:
> But other filesystems don't put the label onto more than 1 device.
> There's the problem for/with btrfs.

Other filesystems don't exist on more than one device, so of course
they don't put a label on more than one device.


Re: parent transid verify failed on -- After moving btrfs closer to the beginning of drive with dd

2012-12-29 Thread cwillu
On Sat, Dec 29, 2012 at 7:14 AM, Jordan Windsor  wrote:
> Also here's the output of btrfs-find-root:
>
> ./btrfs-find-root /dev/sdb1
> Super think's the tree root is at 1229060866048, chunk root 1259695439872
> Went past the fs size, exiting
>
> Not sure where to go from here.

I can't say for certain, but that suggests that the move-via-dd didn't
succeed / wasn't correct, and/or the partitioning changes didn't
match, and/or the dd happened from a mounted filesystem (which would
also explain the transid errors, if there wasn't an unclean umount
involved).

btrfs-restore might be able to pick out files, but you may be in
restore-from-backup territory.


Re: parent transid verify failed on -- After moving btrfs closer to the beginning of drive with dd

2012-12-28 Thread cwillu
On Fri, Dec 28, 2012 at 12:09 PM, Jordan Windsor  wrote:
> Hello,
> I moved my btrfs to the beginning of my drive & updated the partition
> table & also restarted, I'm currently unable to mount it, here's the
> output in dmesg.
>
> [  481.513432] device label Storage devid 1 transid 116023 /dev/sdb1
> [  481.514277] btrfs: disk space caching is enabled
> [  481.522611] parent transid verify failed on 1229060423680 wanted
> 116023 found 116027
> [  481.522789] parent transid verify failed on 1229060423680 wanted
> 116023 found 116027
> [  481.522790] btrfs: failed to read tree root on sdb1
> [  481.523656] btrfs: open_ctree failed
>
> What command should I run from here?

The filesystem wasn't cleanly unmounted, likely on an older kernel.

Try mounting with -o recovery


Re: HIT WARN_ON WARNING: at fs/btrfs/extent-tree.c:6339 btrfs_alloc_free_block+0x126/0x330 [btrfs]()

2012-12-19 Thread cwillu
On Wed, Dec 19, 2012 at 9:12 AM, Rock Lee  wrote:
> Hi all,
>
> Did someone have met this problem before. When doing the tests, I hit
>
> the WARN_ON. Is this log make sense or someone had fixed the problem.
>
>  If needed, I can supply the detail log and the testcase source file.

That'd be good, as well as the specific kernel version.


Re: btrfs subvolume snapshot performance problem

2012-12-18 Thread cwillu
On Tue, Dec 18, 2012 at 7:06 AM, Sylvain Alain  wrote:
> So, if I don't use the discard command, how often do I need to run the
> fstrim command ?

If your SSD isn't a pile of crap, never.  SSDs are always
over-provisioned, and so every time an erase block fills up, the drive
knows that there must be one erase-block worth of garbage which could
be compacted, erased, and added to the pool of empty blocks.  The
crappiest ones only do this as needed (which is why their write speed
plummets with use), and really benefit from people forcing the issue
with -o discard or occasional fstrim.  Everything else should get
along fine without it, although an occasional fstrim certainly won't
hurt: it just shouldn't help much.

> I found this thread : https://patrick-nagel.net/blog/archives/337

It's worth noting that there's a large number of very effective tricks
that an ssd can perform to almost completely negate the caveat
mentioned there.  It really is a solved problem in a modern ssd.


Re: unmountable partition and live distro (no space left)

2012-12-16 Thread cwillu
Try booting with rootflags=ro,recovery in grub (with the latest
possible kernel), or mounting with -o recovery from the livecd
(likewise).  If it works, then you're done, you should be able to boot
normally after a clean umount and shutdown.  If it doesn't, post dmesg
from the attempt.

> I'v been told this is missing relevant details.
> The original kernel version was 3.2.0-something (standard Ubuntu 12.04 LTS).
> I've since upgraded to 3.7 but this has made no difference.
> Right now I don't have the dmesg, I'll post it later.
>
> Currently I've been able to mount the partition with btrfs-restore and am
> trying to rsync it on another ext4 volume.

Terminology note: btrfs-restore doesn't "mount" anything, it just
copies files directly from a device.


Re: Intel 120G SSD write performance with 3.2.0-4-amd64

2012-12-15 Thread cwillu
On Sat, Dec 15, 2012 at 5:23 PM, Russell Coker
 wrote:
> I've got a system running Debian kernel 3.2.0-4-amd64 with root on a SSD that
> identifies itself as "INTEL SSDSC2CT12 300i" (it's an Intel 120G device).

3.2 is massively old in btrfs terms, with lots of fun little stability
and performance bugs.

> Here is the /proc/mounts entry which shows that ssd and discard options are
> enabled.
>
> /dev/disk/by-uuid/7939c405-c656-4e85-a6a0-29f17be09585 / btrfs
> rw,seclabel,nodev,noatime,ssd,discard,space_cache 0 0

Don't use discard; it's a non-queuing command, which means your
performance will suck unless your device is _really_ terrible at
garbage collection (in which case, it's just the lesser of two evils).


Re: Encryption

2012-12-12 Thread cwillu
On Wed, Dec 12, 2012 at 2:06 PM,   wrote:
> On Wed, Dec 12, 2012, at 10:48, cwillu wrote:
>> Sayeth the FAQ:
>
> Oh pardon me, it's BTRFS RAID that's a no-go, which is just as critical
> to me as I have a 4 disk 8TB array.
> The FAQ goeth on to Say:
> ---
> This pretty much forbids you to use btrfs' cool RAID features if you
> need encryption. Using a RAID implementation on top of several encrypted
> disks is much slower than using encryption on top of a RAID device. So
> the RAID implementation must be on a lower layer than the encryption,
> which is not possible using btrfs' RAID support.
>  ---
>
> You saw that I need RAID above.  Were you just trying to criticize my
> memory of the FAQ cwillu?

It's not asking for trouble, it's just asking for poor performance,
and I suspect even that will depend greatly on the workload.

Snapshots still have nothing to do with it:  you could have btrfs
(with snapshots) on dm-crypt on mdraid.  Btrfs would just lose the
ability to try alternate mirrors and similar; snapshots would still
work just fine.


Re: Encryption

2012-12-12 Thread cwillu
On Wed, Dec 12, 2012 at 12:38 PM,   wrote:
>
> On Wed, Dec 12, 2012, at 10:31, Mitch Harder wrote:
>> I run btrfs on top of LUKS encryption on my laptop.  You should be able to 
>> do the same.
>>
>> You could then run rsync through ssh.  However, rsync will have no knowledge 
>> of any blocks shared under subvolume snapshots.
>>
>> Btrfs does not yet have internal encryption.

> The FAQ says specifically to NOT run BTRFS with any kind of volume
> encryption, so you're asking for trouble.

Sayeth the FAQ:

Does Btrfs work on top of dm-crypt?
This is deemed safe since 3.2 kernels. Corruption has been reported
before that, so you want a recent kernel. The reason was improper
passing of device barriers that are a requirement of the filesystem to
guarantee consistency.

> And clearly encryption is not possible if you need snapshots.

Snapshots don't come into this at all:  btrfs doesn't care where the
block devices it's on come from.  Things like dm-crypt show btrfs (or
whatever filesystem you put on it) a decrypted view of the device.


Re: Can't mount luks partition after reboot

2012-12-03 Thread cwillu
On Mon, Dec 3, 2012 at 7:22 PM, Travis LaDuke  wrote:
> This is kind of silly, but may be salvageable...
> I made a btrfs on top of luks partition and tried it for a couple days. Then
> I made another luks partition on another drive then added and balanced that
> new drive as btrfs raid1.  A lot of time passed and the balance finished.
>
> Then I rebooted. The original partition will luksOpen, but btrfs won't mount
> it. The 2nd one is in worse shape, it won't even luksOpen.
> I haven't tried btrfsck yet. Is there something else I should try first?
> What debug info can I post?
>
> halp

"Help" ;p

Try mounting the original with -o degraded, and post the dmesg of the
attempt if it doesn't work.


Re: High-sensitivity fs checker (not repairer) for btrfs

2012-11-10 Thread cwillu
On Sat, Nov 10, 2012 at 4:32 PM, Bob Marley  wrote:
> On 11/10/12 22:23, Hugo Mills wrote:
>>
>> The closest thing is btrfsck. That's about as picky as we've got to
>> date.
>>
>> What exactly is your use-case for this requirement?
>
>
> We need a decently-available system. We can rollback filesystem to
> last-known-good if the "test" detects an inconsistency on current btrfs
> filesystem, but we need a very good test for that (i.e. if last-known-good
> is actually bad we get into serious troubles).

Scrub is probably more useful as a check, combined with "does the
filesystem actually mount".

> So do you think btrfsck can return a false "OK" result? can it "not-see" an
> inconsistency?

No set of checks will ever be perfect, so yes.


Re: (late) REQUEST: Default mkfs.btrfs block size

2012-11-05 Thread cwillu
On Mon, Nov 5, 2012 at 10:06 AM, David Sterba  wrote:
> On Wed, Oct 31, 2012 at 12:20:39PM +, Alex wrote:
>> As one 'stuck' with 4k leaves on my main machine for the moment, can I 
>> request
>> the btrfs-progs v0.20 defaults to more efficient decent block sizes before
>> release. Most distro install programs for the moment don't give access to the
>> options at install time and there seems to be is a significant advantage to 
>> 16k
>> or 32k
>
> IMHO this should be fixed inside the installer, changing defaults for a
> core utility will affect everybody. 4k is the most tested option and
> thus can be considered "safe for everybody".
>
> The installer may let you to enter a shell and create the filesystem by
> hand, then point it to use it for installation.

If we know a better setting, we should default to it.  Punting the
decision to the distro just means I'll spend the next 3 years telling
people "yeah, distro X doesn't set it to the recommended setting
(which isn't the mkfs default), and there's no way to change it
without wiping and reinstalling using manual partitioning blah blah
blah."


Re: [PATCH][BTRFS-PROGS] Enhance btrfs fi df

2012-11-03 Thread cwillu
> do you have more information about raid ? When it will land on the btrfs
> earth ? :-)

An unnamed source recently said "today I'm fixing parity rebuild in
the middle of a read/modify/write. its one of my last blockers", at
which point several gags about progress meters were made.


Re: What's the minimum size I can shrink my FS to?

2012-11-02 Thread cwillu
Run "btrfs balance start -musage=1 -dusage=1" on the mount point, and
then try the resize again.  This may require updated btrfs tools, however.

On Fri, Nov 2, 2012 at 10:09 PM, Jordan Windsor  wrote:
> Hello,
> I'm trying to shrink my Btrfs filesystem to the smallest size it can
> go, here's the information:
>
> failed to read /dev/sr0
> Label: 'Storage'  uuid: 717d4a43-38b3-495f-841b-d223068584de
> Total devices 1 FS bytes used 491.86GB
> devid1 size 612.04GB used 605.98GB path /dev/sda6
>
> Btrfs Btrfs v0.19
>
> Data: total=580.90GB, used=490.88GB
> System, DUP: total=32.00MB, used=76.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=12.51GB, used=1001.61MB
>
> Here's the command I use to resize:
>
> [root@archpc ~]# btrfs file res 500g /home/jordan/Storage/
> Resize '/home/jordan/Storage/' of '500g'
> ERROR: unable to resize '/home/jordan/Storage/' - No space left on device
>
> I was wondering if that size doesn't work then what's the minimum I
> can shrink to?
>
> Thanks.


Re: [Request for review] [RFC] Add label support for snapshots and subvols

2012-11-01 Thread cwillu
> Below is a demo of this new feature.
> 
>  btrfs fi label -t /btrfs/sv1 "Prod-DB"
>
>  btrfs fi label -t /btrfs/sv1
> Prod-DB
>
>  btrfs su snap /btrfs/sv1 /btrfs/snap1-sv1
> Create a snapshot of '/btrfs/sv1' in '/btrfs/snap1-sv1'
>  btrfs fi label -t /btrfs/snap1-sv1
>
>  btrfs fi label -t /btrfs/snap1-sv1 "Prod-DB-sand-box-testing"
>
>  btrfs fi label -t /btrfs/snap1-sv1
> Prod-DB-sand-box-testing

Why is this better than:

# btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
# mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
# ls /btrfs/
Prod-DB  Prod-DB-production-test


Re: Why btrfs inline small file by default?

2012-10-31 Thread cwillu
On Wed, Oct 31, 2012 at 4:48 AM, Ahmet Inan
 wrote:
>>> i also dont see any benefit from inlining small files:
>
>>> with defaults (inlining small files):
>>> real4m39.253s
>>> Data: total=10.01GB, used=9.08GB
>>> Metadata, DUP: total=2.00GB, used=992.48MB
>
>>> without inline:
>>> real4m42.085s
>>> Data: total=11.01GB, used=10.85GB
>>> Metadata, DUP: total=1.00GB, used=518.59MB
>>
>> I suggest you take a closer look at your numbers.
>
> both use 12GiB in total and both need 280 seconds.
> am i missing something?

9.08GB + 992.48MB*2 == 11.02GB

10.85GB + 518MB*2 == 11.86GB

That's nearly a GB smaller.
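Spelled out (DUP metadata counts twice on disk; a quick check of the figures above):

```python
# On-disk footprint = data used + 2 * metadata used (DUP keeps two copies).
inline    = 9.08  + 2 * 992.48 / 1024   # defaults (small files inlined)
no_inline = 10.85 + 2 * 518.59 / 1024   # mounted with max_inline=0

print("inline:    %.2f GB" % inline)     # -> inline:    11.02 GB
print("no inline: %.2f GB" % no_inline)  # -> no inline: 11.86 GB
print("delta:     %.2f GB" % (no_inline - inline))
```

The "total" columns only show how much space is allocated to block groups, which is why eyeballing them makes the two runs look identical.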


Re: Why btrfs inline small file by default?

2012-10-31 Thread cwillu
On Wed, Oct 31, 2012 at 2:48 AM, Ahmet Inan
 wrote:
> i also dont see any benefit from inlining small files:

> with defaults (inlining small files):
> real4m39.253s
> Data: total=10.01GB, used=9.08GB
> Metadata, DUP: total=2.00GB, used=992.48MB

> without inline:
> real4m42.085s
> Data: total=11.01GB, used=10.85GB
> Metadata, DUP: total=1.00GB, used=518.59MB

I suggest you take a closer look at your numbers.


Re: Why btrfs inline small file by default?

2012-10-30 Thread cwillu
On Tue, Oct 30, 2012 at 5:47 PM, ching  wrote:
> On 10/31/2012 06:19 AM, Hugo Mills wrote:
>> On Tue, Oct 30, 2012 at 10:14:12PM +, Hugo Mills wrote:
>>> On Wed, Oct 31, 2012 at 05:40:25AM +0800, ching wrote:
>>>> On 10/30/2012 08:17 PM, cwillu wrote:
>>>>>>> If there is a lot of small files, then the size of metadata will be
>>>>>>> undesirable due to deduplication
>>>>>> Yes, that is a fact, but if that really matters depends on the use-case
>>>>>> (e.g., the small files to large files ratio, ...). But as btrfs is 
>>>>>> designed
>>>>>> explicitly as a general purpose file system, you usually want the good
>>>>>> performance instead of the better disk-usage (especially as disk space 
>>>>>> isn't
>>>>>> expensive anymore).
>>>>> As I understand it, in basically all cases the total storage used by
>>>>> inlining will be _smaller_, as the allocation doesn't need to be
>>>>> aligned to the sector size.
>>>>>
>>>> if i have 10G small files in total, then it will consume 20G by default.
>>>If those small files are each 128 bytes in size, then you have
>>> approximately 80 million of them, and they'd take up 80 million pages,
>>> or 320 GiB of total disk space.
>>Sorry, to make that clear -- I meant if they were stored in Data.
>> If they're inlined in metadata, then they'll take approximately 20 GiB
>> as you claim, which is a lot less than the 320 GiB they'd be if
>> they're not.
>>
>>Hugo.
>>
>
>
> is it the same for:
> 1. 3k per file with leaf size=4K
> 2. 60k per file with leaf size=64k
>
>

import os
import sys

data = "1" * 1024 * 3

for x in xrange(100 * 1000):
  with open('%s/%s' % (sys.argv[1], x), 'a') as f:
f.write(data)

root@repository:~$ mount -o loop ~/inline /mnt
root@repository:~$ mount -o loop,max_inline=0 ~/noninline /mnt2

root@repository:~$ time python test.py /mnt
real0m11.105s
user0m1.328s
sys 0m5.416s
root@repository:~$ time python test.py /mnt2
real0m21.905s
user0m1.292s
sys 0m5.460s

root@repository:/$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=652.70MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=1.01GB, used=391.12MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=60.98MB
Metadata: total=8.00MB, used=0.00

3k data, 4k leaf: inline is twice the speed, but 1.4x bigger.



root@repository:~$ mkfs.btrfs inline -l 64k
root@repository:~$ mkfs.btrfs noninline -l 64k
...
root@repository:~$ time python test.py /mnt
real0m12.244s
user0m1.396s
sys 0m8.101s
root@repository:~$ time python test.py /mnt2
real0m13.047s
user0m1.436s
sys 0m7.772s

root@repository:/$ btrfs fi df /mnt
Data: total=8.00MB, used=256.00KB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=342.06MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=1.01GB, used=391.10MB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=50.06MB
Metadata: total=8.00MB, used=0.00

3k data, 64k leaf: inline is still 10% faster, and is now 25% smaller



data = "1" * 1024 * 32

... (mkfs, mount, etc)

root@repository:~$ time python test.py /mnt
real0m17.834s
user0m1.224s
sys 0m4.772s
root@repository:~$ time python test.py /mnt2
real0m20.521s
user0m1.304s
sys 0m6.344s

root@repository:/$ btrfs fi df /mnt
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=54.00MB
Metadata: total=8.00MB, used=0.00

root@repository:/$ btrfs fi df /mnt2
Data: total=4.01GB, used=3.05GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=53.56MB
Metadata: total=8.00MB, used=0.00

32k data, 64k leaf: inline is still 10% faster, and is now the same
size (not dead sure why, probably some interaction with the size of
the actual write that happens)



data = "1" * 1024 * 7

... etc


root@repository:~$ time python test.py /mnt
real0m9.628s
user0m1.368s
sys 0m4.188s
root@repository:~$ time python test.py /mnt2
real0m13.455s
user0m1.608s
sys 0m7.884s

root@repository:/$ btrfs fi df /mnt
Data: total=3.01GB, used=1.91GB
System, DUP: total=8.00MB, used=64.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=74

Re: Why btrfs inline small file by default?

2012-10-30 Thread cwillu
On Tue, Oct 30, 2012 at 3:40 PM, ching  wrote:
> On 10/30/2012 08:17 PM, cwillu wrote:
>>>> If there is a lot of small files, then the size of metadata will be
>>>> undesirable due to deduplication
>>>
>>> Yes, that is a fact, but if that really matters depends on the use-case
>>> (e.g., the small files to large files ratio, ...). But as btrfs is designed
>>> explicitly as a general purpose file system, you usually want the good
>>> performance instead of the better disk-usage (especially as disk space isn't
>>> expensive anymore).
>> As I understand it, in basically all cases the total storage used by
>> inlining will be _smaller_, as the allocation doesn't need to be
>> aligned to the sector size.
>>
>
> if i have 10G small files in total, then it will consume 20G by default.
>
> ching

No.  No they will not.  As I already explained.

root@repository:/mnt$ mount ~/inline /mnt -o loop
root@repository:/mnt$ mount ~/inline /mnt2 -o loop,max_inline=0

root@repository:/mnt$ mount
/dev/loop0 on /mnt type btrfs (rw)
/dev/loop1 on /mnt2 type btrfs (rw,max_inline=0)

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff"
> /mnt/$x; done

real1m5.447s
user0m38.422s
sys 0m18.493s

root@repository:/mnt$ time for x in {1..243854}; do echo "some stuff"
> /mnt2/$x; done

real1m49.880s
user0m40.379s
sys 0m26.210s

root@repository:/mnt$ df /mnt /mnt2
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/loop010485760266952   8359680   4% /mnt
/dev/loop110485760   1311620   7384236  16% /mnt2

root@repository:/mnt$ btrfs fi df /mnt
Data: total=1.01GB, used=256.00KB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=130.22MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi df /mnt2
Data: total=2.01GB, used=953.05MB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=164.03MB
Metadata: total=8.00MB, used=0.00

root@repository:/mnt$ btrfs fi show
Label: none  uuid: e5440337-9f44-4b2d-9889-80ab0ab8f245
Total devices 1 FS bytes used 130.47MB
devid1 size 10.00GB used 3.04GB path /dev/loop0

Label: none  uuid: cfcc4149-3102-465d-89b8-0a6bb6a4749a
Total devices 1 FS bytes used 1.09GB
devid1 size 10.00GB used 4.04GB path /dev/loop1

Btrfs Btrfs v0.19

Any questions?


Re: Why btrfs inline small file by default?

2012-10-30 Thread cwillu
>> If there is a lot of small files, then the size of metadata will be
>> undesirable due to deduplication
>
>
> Yes, that is a fact, but if that really matters depends on the use-case
> (e.g., the small files to large files ratio, ...). But as btrfs is designed
> explicitly as a general purpose file system, you usually want the good
> performance instead of the better disk-usage (especially as disk space isn't
> expensive anymore).

As I understand it, in basically all cases the total storage used by
inlining will be _smaller_, as the allocation doesn't need to be
aligned to the sector size.


Re: btrfs defrag problem

2012-10-30 Thread cwillu
On Tue, Oct 30, 2012 at 5:47 AM, ching  wrote:
> Hi all,
>
> I try to defrag my btrfs root partition (run by root privilege)
>
> find / -type f -o -type d -print0 | xargs --null --no-run-if-empty btrfs 
> filesystem defragment -t $((32*1024*1024))
>
>
> 1. This kind of error messages is prompted:
>
> failed to open /bin/bash
> open:: Text file busy
> total 1 failures
> failed to open /lib64/ld-2.15.so
> open:: Text file busy
> total 1 failures
> failed to open /sbin/agetty
> open:: Text file busy
> failed to open /sbin/btrfs
> open:: Text file busy
> failed to open /sbin/dhclient
> open:: Text file busy
> failed to open /sbin/init
> open:: Text file busy
> failed to open /sbin/udevd
>
> It seems that locked files cannot be defragged, is it expected behaviour?

I can't reproduce that behaviour here, although maybe you're running
an older kernel with some bug that's since been fixed?

> 2. Btrfs Wiki mentions that defrag directory will defrag metadata, is 
> symlink/hardlink considered as metadata?
>
> P.S. inline data is already disabled by "max_inline=0"

Well, that's a silly thing to do, causing every small file to take up
a separate 4kb block rather than its size * 2, and requiring extra
seeks to read/write them (i.e., if you have a million 10 byte files,
they'll now take up 4GB instead of 20MB).
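The back-of-the-envelope arithmetic behind that claim, as a sketch (assuming 4 KiB data blocks and DUP metadata holding two copies of each inlined file):

```python
# Rough cost of a million 10-byte files, per the example above.
n = 1_000_000
file_size = 10          # bytes per file
block = 4096            # bytes: one data block per file when not inlined

not_inlined = n * block            # each tiny file rounds up to a full block
inlined     = n * file_size * 2    # stored twice in DUP metadata

print("not inlined: %.1f GB" % (not_inlined / 1e9))  # -> not inlined: 4.1 GB
print("inlined:     %.0f MB" % (inlined / 1e6))      # -> inlined:     20 MB
```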

> 3. Is any possible to online defrag a btrfs partition without hindered by 
> mount point/polyinstantied directories?

If you're asking if you can defrag an unmounted btrfs, not at this
time.  It's possible in principle, but nobody has cared enough to
implement it yet.


Re: Naming of subvolumes

2012-10-26 Thread cwillu
On Fri, Oct 26, 2012 at 9:54 AM, Chris Murphy  wrote:
>
> On Oct 26, 2012, at 2:27 AM, Richard Hughes  wrote:
>
>
>>> And if you're going to apply the upgrade to the snapshot, or to the top 
>>> level file system?
>>
>> That's a very good question. I was going to apply the upgrade to the
>> top level file system, and then roll-back to the old snapshot if the
>> new upgrade state does not boot to a GUI. It would work equally well
>> applying the update to the snapshot and then switching to that,
>> although I don't know how well userspace is going to cope switching to
>> a new subvolume at early-runtime given the fact we're running the
>> upgrade itself from the filesystem and not in a special initrd or
>> something. Also, we need to ready out lots of upgrade data (200Mb+?)
>> from somewhere, so doing this on the snapshot would mean creating the
>> snapshot and copying the data there *before* we reboot to install
>> upgrades, and we really want to create the snapshot when we're sure
>> the system will boot.
>
> a. Upgrade top level:
> Maybe download upgrade data to /tmp, upon completion snapshot the current 
> system state, then copy the upgrade data to persistent storage and reboot; 
> upgrade of top level boot root begins. The snapshot is the regressive state 
> in case of failure.
>
> b. Upgrade snapshot:
> Create snapshot, mount it somewhere; download upgrade data to a location 
> within the snapshot; reboot from the snapshot, it upgrades itself and cleans 
> up. The top level is the regressive state in case of failure.
>
> Either way, 200MB of downloaded (and undeleted) upgrade data isn't stuck in a 
> snapshot after it's used. And either way the snapshot is bootable.
>
> If you get sufficient metadata in the snapshot, then you can name/rename the 
> snapshots whatever you want. I'd also point out it's valid for the user to 
> prefer a different organization, i.e. instead of Fedora taking over the top 
> level of a btrfs volume, to create subvolumes Fedora 17, Fedora 18, Ubuntu 
> 12X, etc., at the top level, and insert boot and root and possibly home in 
> those. In which case the upgrade mechanism should still work.
>
>>
>>> So I'm going to guess that you will actually create a subvolume named 
>>> something like @system-upgrade-20121025, and then snapshot root, boot, and 
>>> home into that subvol?
>>
>> Not /home. Packages shouldn't be installing stuff there anyway, and
>> /home is going to typically much bigger than /root or /boot.
>
> OK so small problem here is that today /etc/fstab is pointing to the home 
> subvolume in a relative location to the default subvolume. The fstab mount 
> option is subvol=home, not subvol=/home, not subvolid=xxx.
>
> So if you want to use changing default subvolumes to make the switch between 
> the current updated state, and rollback states, (which is milliseconds fast), 
> which also means no changes needed to grub's core.img or grub.cfg (since 
> those use relative references for boot and root), a change is needed for home 
> to use an absolute reference: either subvol=/home or use subvolid= in the 
> fstab.
>
> While a bit more obscure in the /etc/fstab the subvolid= is more reliable. 
> That home can be renamed or moved anywhere and it'll still work. I think it's 
> legitimate for a user to create or want
>
>
>>> If the upgrade is not successful, you change the default subvolume ID to 
>>> that of @system-upgrade-20121025.
>>
>> I was actually thinking of using btrfs send | btrfs receive to roll
>> back the root manually. It would be better if btrfs could swap the
>> subvolume ID's of @system-upgrade-20121025 and 0, as then we don't get
>> a snapshot that's useless.
>
> I haven't tried btrfs send/receive for this purpose, so I can't compare. But 
> btrfs subvolume set-default is faster than the release of my finger from the 
> return key. And it's easy enough that the user could do it themselves if they 
> had reasons for reverting to a snapshot that differ from the automagic 
> determination of the upgrade pass/fail.
>
> The one needed change, however, is to get /etc/fstab to use an absolute 
> reference for home.
>
>
> Chris Murphy
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

I'd argue that everything should be absolute references to subvolumes
(/@home, /@, etc.), and neither set-default nor subvolume IDs should
be touched.  There's no need, as you can simply mv those around (even
while mounted).  More importantly, it doesn't result in a case where
the fstab in one snapshot points its mountpoint to a different
snapshot, with all the hilarity that would cause over time, and it
also allows multiple distros to be installed on the same filesystem
without having them stomp on each other's set-defaults: /@fedora,
/@rawhide, /@ubuntu, /@home, etc.
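
As a sketch of what that looks like in practice (the UUID is a
placeholder and the subvolume names are just the convention above),
the top level of the filesystem holds one subvolume per install, and
every fstab entry uses an absolute subvol= path:

```
# Top level of the pool (subvolid=5), as seen when mounting with no
# subvol= option:
#   /@fedora   /@rawhide   /@ubuntu   /@home
#
# /etc/fstab inside the Fedora install -- note the leading slash,
# which makes the references absolute:
UUID=<fs-uuid>  /      btrfs  subvol=/@fedora  0 0
UUID=<fs-uuid>  /home  btrfs  subvol=/@home    0 0
```

Rollback is then just mounting the top level and mv'ing /@fedora
aside before mv'ing the snapshot into its place; no set-default
involved, and every snapshot's fstab keeps pointing at the same
subvolumes.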

Re: [RFC] New attempt to a better "btrfs fi df"

2012-10-25 Thread cwillu
On Thu, Oct 25, 2012 at 8:33 PM, Chris Murphy  wrote:
> So what's the intended distinction between 'fi df' and 'fi show'? Because for 
> months using btrfs I'd constantly be confused which command was going to show 
> me what information I wanted, and that tells me there should be some better 
> distinction between the commands.

Or the distinction should be removed, which is what this patch effectively does.


Re: [RFC] New attempt to a better "btrfs fi df"

2012-10-25 Thread cwillu
On Thu, Oct 25, 2012 at 2:36 PM, Chris Murphy  wrote:
> My suggestion is that by default a summary similar to the existing df command 
> be mimicked, where it makes sense, for btrfs fi df.
>
>  - I like the Capacity %. If there is a reliable equivalent, it need not be 
> inode based, that would be great.
>
> -  I care far less about the actual physical device information, more about 
> the btrfs volume(s) as a whole. How big is the volume, how much of that is 
> used, and how much is available?
>
> I understand the challenges behind estimating the amount available. So if the 
> value for available/free is precided by a ~ in each case, or as a heading 
> "~Avail" or "~Free" I'd be OK with that disclaimer.
>
> I think the examples so far are reporting too much information and it's 
> difficult to get just what I want.

Plain old "/bin/df" is adequate for that though, and in the meantime
one _does_ need _all_ of that information to work with the filesystem.
However, the detailed breakdown is vital to answer many questions:

"Why can't I write to my filesystem with 80GB free?
Oh, because metadata is raid1 and the one disk is 80GB smaller than
the other."

"How much data is on this disk that started giving SMART errors?"

"How many GB of vm image files (or other large files) can I probably
fit on this fs?"

"How many GB of mail (or other tiny files) can I probably fit on this fs?"

"Is there enough space to remove this disk from the fs, and how much
free space will I have then?"

And the all-important "Could you please run btrfs fi df and pastebin
the output so we can tell what the hell is going on?" :)


Re: [RFC] New attempt to a better "btrfs fi df"

2012-10-25 Thread cwillu
On Thu, Oct 25, 2012 at 2:03 PM, Chris Murphy  wrote:
>
> On Oct 25, 2012, at 1:21 PM, Goffredo Baroncelli  wrote:
>>
>> Moreover I still didn't understand how btrfs was using the disks.
>
> This comment has less to do with the RFC, and more about user confusion in a 
> specific case of the existing fi df behavior. But since I have the same 
> misunderstanding of how btrfs is using the disks, I decided to reply to this 
> thread.
>
> While working with Fedora 18's new System Storage Manager [1], I came across 
> this problem. For reference the bug report [2] which seems less of a bug with 
> ssm than a peculiarity with btrfs chunk allocation and how fi df reports usage.
>
> 80GB VDI, Virtual Box VM, containing Fedora 18: installed and yum updated 2-3 
> times. That's it, yet for some reason, 76 GB of chunks have been allocated 
> and they're all full? This doesn't make sense when there's just under 4GB  of 
> data on this single device.
>
> [root@f18v ~]# btrfs fi show
> Label: 'fedora'  uuid: 780b8553-4097-4136-92a4-c6fd48779b0c
> Total devices 1 FS bytes used 3.93GB
> devid1 size 76.06GB used 76.06GB path /dev/sda1
>
> [root@f18v ~]# btrfs fi df /
> Data: total=72.03GB, used=3.67GB
> System, DUP: total=8.00MB, used=16.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=2.00GB, used=257.54MB
> Metadata: total=8.00MB, used=0.00
>
> I decided to rebalance, and while things become a lot more sensible, I'm 
> still confused:
>
> [chris@f18v ~]$ sudo btrfs fi show
> failed to read /dev/sr0
> Label: 'fedora'  uuid: 780b8553-4097-4136-92a4-c6fd48779b0c
> Total devices 1 FS bytes used 3.91GB
> devid1 size 76.06GB used 9.13GB path /dev/sda1
>
> [chris@f18v ~]$ sudo btrfs fi df /
> Data: total=5.00GB, used=3.66GB
> System, DUP: total=64.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=2.00GB, used=256.84MB
>
> Points of confusion:
>
> 1. Why is FS bytes used = 3.91GB, yet devid 1 used is 9.13 GB?

"FS bytes" is what du -sh would show.  "devid 1 used" is space
allocated to some block group (without that block group itself being
entirely used).

> 2. Why before a rebalance does 'fi df' show extra lines, and then after 
> rebalance there are fewer lines? Another case with raid10, 'fi df' shows six 
> lines of data, but then after rebalance is shows three lines?

A bug in mkfs causes some tiny blockgroups with the wrong profile to
be created; as they're unused, they get cleaned up by the balance.

> 3. How does Data: total=72GB before rebalance, but is 5GB after rebalance? 
> This was a brand new file system, file system installed, with maybe 2-3 
> updates, and a dozen or two reboots. That's it. No VM's created on that 
> volume (it's a VDI itself), and the VDI file itself never grew beyond 9GB.

Combine the previous two answers: You had 72GB allocated to block
groups which are mostly empty.  After the balance, the contents of
those groups have been shuffled around such that most of them could be
freed.
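
For anyone hitting the same state: the cleanup doesn't need a full
rebalance. A usage filter (supported by roughly 3.3+ kernels with a
current btrfs-progs) only rewrites the mostly-empty block groups. A
sketch, with the mountpoint as an example:

```
# Rewrite only data block groups that are less than 5% used; these are
# the ones inflating the "total" numbers.
btrfs balance start -dusage=5 /

# Compare allocation before and after:
btrfs fi show
btrfs fi df /
```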


Re: [RFC] New attempt to a better "btrfs fi df"

2012-10-25 Thread cwillu
>>> Allocated_area:
>>>   Data,RAID0: Size:921.75MB, Used:256.00KB
>>>  /dev/vdc 307.25MB
>>>  /dev/vdb 307.25MB
>>>  /dev/vdd 307.25MB
>>>
>>>   Data,Single: Size:8.00MB, Used:0.00
>>>  /dev/vdb   8.00MB
>>>
>>>   System,RAID1: Size:8.00MB, Used:4.00KB
>>>  /dev/vdd   8.00MB
>>>  /dev/vdc   8.00MB
>>>
>>>   System,Single: Size:4.00MB, Used:0.00
>>>  /dev/vdb   4.00MB
>>>
>>>   Metadata,RAID1: Size:460.94MB, Used:24.00KB
>>>  /dev/vdb 460.94MB
>>>  /dev/vdd 460.94MB
>>>
>>>   Metadata,Single: Size:8.00MB, Used:0.00
>>>  /dev/vdb   8.00MB
>>>
>>>   Unused:
>>>  /dev/vdb   2.23GB
>>>  /dev/vdc   2.69GB
>>>  /dev/vdd   2.24GB
>>
>> Couple minor things, in order of personal opinion of severity:
>>
>> * Devices should be listed in a consistent order; device names are
>> just too consistently similar
> Could you elaborate? I didn't understand.

Thereby demonstrating the problem :)

Data,RAID0: Size:921.75MB, Used:256.00KB
   /dev/vdc 307.25MB
   /dev/vdb 307.25MB
   /dev/vdd 307.25MB
Unused:
   /dev/vdb   2.23GB
   /dev/vdc   2.69GB
   /dev/vdd   2.24GB

The first goes c, b, d; the second goes b, c, d.
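
The fix on the tool side is just to sort each section's device list
before printing it; a minimal sketch of the idea in shell (the names
and sizes are the ones from the example output above):

```shell
# Emit the RAID0 section's devices through sort(1) so the listing order
# no longer depends on chunk-tree enumeration order.
devices='/dev/vdc 307.25MB
/dev/vdb 307.25MB
/dev/vdd 307.25MB'

printf '%s\n' "$devices" | sort
```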


Re: Need help mounting laptop corrupted root btrfs. Kernel BUG at fs/btrfs/volumes.c:3707

2012-10-25 Thread cwillu
On Thu, Oct 25, 2012 at 1:58 PM, Marc MERLIN  wrote:
> Howdy,
>
> I can wait a day or maybe 2 before I have to wipe and restore from backup.
> Please let me know if you have a patch against 3.6.3 you'd like me to try
> to mount/recover this filesystem, or whether you'd like me to try btrfsck.
>
>
> My laptop had a problem with its boot drive which prevented linux
> from writing to it, and in turn caused btrfs to have incomplete writes
> to it.
> After reboot, the boot drive was fine, but the btrfs filesystem has
> a corruption that prevents it from being mounted.
>
> Unfortunately the mount crash prevents writing of crash data to even another
> drive since linux stops before the crash data can be written to syslog.
>
> Picture #1 shows a dump when my laptop crashed (before reboot).
> btrfs no csum found for inode X start Y
> http://marc.merlins.org/tmp/crash.jpg
>
> Mounting with 3.5.0 and 3.6.3 gives the same error:
>
> gandalfthegreat:~# mount -o recovery,skip_balance,ro /dev/mapper/bootdsk
>
> shows
> btrfs: bdev /dev/mapper/bootdsk errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
> btrfs: bdev /dev/mapper/bootdsk errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> (there are 2 lines, not sure why)
>
> kernel BUG at fs/btrfs/volumes.c:3707
> int btrfs_num_copies(struct btrfs_mapping_tree *map_tree, u64 logical, u64 
> len)
> {
> struct extent_map *em;
> struct map_lookup *map;
> struct extent_map_tree *em_tree = &map_tree->map_tree;
> int ret;
>
> read_lock(&em_tree->lock);
> em = lookup_extent_mapping(em_tree, logical, len);
> read_unlock(&em_tree->lock);
> BUG_ON(!em);  <---
>
> If the snapshot helps (sorry, hard to read, but usable):
> http://marc.merlins.org/tmp/btrfs_bug.jpg
>
> Questions:
> 1) Any better way to get a proper dump without serial console?
> (I hate to give you pictures)
>
> 2) Should I try btrfsck now, or are there other mount options than
> mount -o recovery,skip_balance,ro /dev/mapper/bootdsk
> I should try?
>
> 3) Want me to try btrfsck although it may make it impossible for me to
> reproduce the bug and test a fix, as well as potentially break the filesystem
> more (last time I tried btrfsck, it outputted thousands of lines and never 
> converged
> to a state it was happy with)

This looks like something btrfs-zero-log would work around (although
-o recovery should do mostly the same things).  That would destroy the
evidence though, and may just make things (slightly) worse, so I'd
wait to see if anyone suggests something better before trying it.  If
you're ultimately ending up restoring from backup though, it may save
you that effort at least.
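
For the record, the rough order I'd try things in, most reversible
first (treat this as a sketch, not gospel; the image destination path
is hypothetical):

```
# 1. Image the device first, so nothing below destroys the evidence:
dd if=/dev/mapper/bootdsk of=/mnt/other/bootdsk.img bs=1M

# 2. The mount you already tried (log replay skipped where possible):
mount -o ro,recovery /dev/mapper/bootdsk /mnt

# 3. Last resort: discard the tree log entirely; this loses the final
#    seconds of writes and destroys the state developers might want:
btrfs-zero-log /dev/mapper/bootdsk
```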


Re: [RFC] New attempt to a better "btrfs fi df"

2012-10-25 Thread cwillu
> I haven't published the patches because they aren't in good shape. However I
> really like the output. The example is a filesystem based on three
> disks of 3GB.
>
> It is clear that:
> - - RAID0 uses all the disks
> - - RAID1 uses two different disks
>
> Comments are welcome.
>
> Known bugs:
> - - if a filesystem uses a disk but there isn't any chunk on it, the disk is
> not shown (solvable)
> - - this command needs root capability (I use the BTRFS_IOC_TREE_SEARCH
> to get the chunk info; so that is unavoidable)
>
>
> ghigo@emulato:~$ sudo ./btrfs fi df /mnt/btrfs1/
> [sudo] password for ghigo:
> Path: /mnt/btrfs1
> Summary:
>   Disk_size:   9.00GB
>   Disk_allocated:  1.83GB
>   Disk_unallocated:7.17GB
>   Used:  284.00KB
>   Free_(Estimated):6.76GB   (Max: 8.54GB, min: 4.96GB)
>   Data_to_disk_ratio:75 %
>
> Allocated_area:
>   Data,RAID0: Size:921.75MB, Used:256.00KB
>  /dev/vdc 307.25MB
>  /dev/vdb 307.25MB
>  /dev/vdd 307.25MB
>
>   Data,Single: Size:8.00MB, Used:0.00
>  /dev/vdb   8.00MB
>
>   System,RAID1: Size:8.00MB, Used:4.00KB
>  /dev/vdd   8.00MB
>  /dev/vdc   8.00MB
>
>   System,Single: Size:4.00MB, Used:0.00
>  /dev/vdb   4.00MB
>
>   Metadata,RAID1: Size:460.94MB, Used:24.00KB
>  /dev/vdb 460.94MB
>  /dev/vdd 460.94MB
>
>   Metadata,Single: Size:8.00MB, Used:0.00
>  /dev/vdb   8.00MB
>
>   Unused:
>  /dev/vdb   2.23GB
>  /dev/vdc   2.69GB
>  /dev/vdd   2.24GB

Couple minor things, in order of personal opinion of severity:

* Devices should be listed in a consistent order; device names are
just too consistently similar

* System chunks shouldn't be listed between data and metadata; really,
they're just noise 99% of the time anyway

* I think it may be more useful to display each disk, with the
profiles in use underneath.  With a larger number of disks, that would
make it _much_ easier to tell at-a-glance what is currently on a disk
(that I may want to remove, or which I may suspect to be unreliable).

* I'd rename "Unused" to "Unallocated" for consistency with the section title

* (and I still detest the_underscores_between_all_the_words; it
doesn't make parsing significantly easier, and it's an eyesore)

* Three coats of blue paint plus a clear-coat is the One True Paint-Job.


Re: btrfs seems to do COW while inode has NODATACOW set

2012-10-25 Thread cwillu
On Thu, Oct 25, 2012 at 12:35 PM, Alex Lyakas
 wrote:
> Hi everybody,
> I need some help understanding the nodatacow behavior.
>
> I have set up a large file (5GiB), which has very few EXTENT_DATAs
> (all are real, not bytenr=0). The file has NODATASUM and NODATACOW
> flags set (flags=0x3):
> item 4 key (257 INODE_ITEM 0) itemoff 3591 itemsize 160
> inode generation 5 transid 5 size 5368709120 nbytes 5368709120
> owner[0:0] mode 100644
> inode blockgroup 0 nlink 1 flags 0x3 seq 0
> item 7 key (257 EXTENT_DATA 131072) itemoff 3469 itemsize 53
> item 8 key (257 EXTENT_DATA 33554432) itemoff 3416 itemsize 53
> item 9 key (257 EXTENT_DATA 67108864) itemoff 3363 itemsize 53
> item 10 key (257 EXTENT_DATA 67112960) itemoff 3310 itemsize 53
> item 11 key (257 EXTENT_DATA 67117056) itemoff 3257 itemsize 53
> item 12 key (257 EXTENT_DATA 67121152) itemoff 3204 itemsize 53
> item 13 key (257 EXTENT_DATA 67125248) itemoff 3151 itemsize 53
> item 14 key (257 EXTENT_DATA 67129344) itemoff 3098 itemsize 53
> item 15 key (257 EXTENT_DATA 67133440) itemoff 3045 itemsize 53
> item 16 key (257 EXTENT_DATA 67137536) itemoff 2992 itemsize 53
> item 17 key (257 EXTENT_DATA 67141632) itemoff 2939 itemsize 53
> item 18 key (257 EXTENT_DATA 67145728) itemoff 2886 itemsize 53
> item 19 key (257 EXTENT_DATA 67149824) itemoff 2833 itemsize 53
> item 20 key (257 EXTENT_DATA 67153920) itemoff 2780 itemsize 53
> item 21 key (257 EXTENT_DATA 67158016) itemoff 2727 itemsize 53
> item 22 key (257 EXTENT_DATA 67162112) itemoff 2674 itemsize 53
> item 23 key (257 EXTENT_DATA 67166208) itemoff 2621 itemsize 53
> item 24 key (257 EXTENT_DATA 67170304) itemoff 2568 itemsize 53
> item 25 key (257 EXTENT_DATA 67174400) itemoff 2515 itemsize 53
> extent data disk byte 67174400 nr 5301534720
> extent data offset 0 nr 5301534720 ram 5301534720
> extent compression 0
> As you can see from the last extent, the file size is exactly 5GiB.
>
> Then I also mount btrfs with nodatacow option.
>
> root@vc:/btrfs-progs# ./btrfs fi df /mnt/src/
> Data: total=5.47GB, used=5.00GB
> System: total=32.00MB, used=4.00KB
> Metadata: total=512.00MB, used=28.00KB
>
> (I have set up block groups myself by playing with mkfs code and
> conversion code to learn about the extent tree. The filesystem passes
> btrfsck fine, with no errors. All superblock copies are consistent.)
>
> Then I run parallel random IOs on the file, and almost immediately hit
> ENOSPC. When looking at the file, I see that now it has a huge amount
> of EXTENT_DATAs:
> item 4 key (257 INODE_ITEM 0) itemoff 3593 itemsize 160
> inode generation 5 transid 21 size 5368709120 nbytes 5368709120
> owner[0:0] mode 100644
> inode blockgroup 0 nlink 1 flags 0x3 seq 130098
> item 6 key (257 EXTENT_DATA 0) itemoff 3525 itemsize 53
> item 7 key (257 EXTENT_DATA 131072) itemoff 3472 itemsize 53
> item 8 key (257 EXTENT_DATA 262144) itemoff 3419 itemsize 53
> item 9 key (257 EXTENT_DATA 524288) itemoff 3366 itemsize 53
> item 10 key (257 EXTENT_DATA 655360) itemoff 3313 itemsize 53
> item 11 key (257 EXTENT_DATA 1310720) itemoff 3260 itemsize 53
> item 12 key (257 EXTENT_DATA 1441792) itemoff 3207 itemsize 53
> item 13 key (257 EXTENT_DATA 2097152) itemoff 3154 itemsize 53
> item 14 key (257 EXTENT_DATA 2228224) itemoff 3101 itemsize 53
> item 15 key (257 EXTENT_DATA 2752512) itemoff 3048 itemsize 53
> item 16 key (257 EXTENT_DATA 2883584) itemoff 2995 itemsize 53
> item 17 key (257 EXTENT_DATA 11927552) itemoff 2942 itemsize 53
> item 18 key (257 EXTENT_DATA 12058624) itemoff 2889 itemsize 53
> item 19 key (257 EXTENT_DATA 13238272) itemoff 2836 itemsize 53
> item 20 key (257 EXTENT_DATA 13369344) itemoff 2783 itemsize 53
> item 21 key (257 EXTENT_DATA 16646144) itemoff 2730 itemsize 53
> item 22 key (257 EXTENT_DATA 16777216) itemoff 2677 itemsize 53
> item 23 key (257 EXTENT_DATA 17432576) itemoff 2624 itemsize 53
> ...
>
> and:
> root@vc:/btrfs-progs# ./btrfs fi df /mnt/src/
> Data: total=5.47GB, used=5.46GB
> System: total=32.00MB, used=4.00KB
> Metadata: total=512.00MB, used=992.00KB
>
> Kernel is for-linus branch from Chris's tree, up to
> f46dbe3dee853f8a860f889cb2b7ff4c624f2a7a (this is the last commit
> there now).
>
> I was under impression that if a file is marked as NODATACOW, then new
> writes will never allocate EXTENT_DATAs if appropriate EXTENT_DATAs
> already exist. However, it is clearly not the case, or maybe I am
> doing something wrong.
>
> Can anybody please help me to debug further and understand why this is
> happening?

Have there been any snapshots taken, and/or was the filesystem
converted from ext?  In those cases, there will be one final copy
taken for the write.
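
Two quick checks along those lines, before digging into the allocator
itself (the mountpoint is from your setup; the file name is an example):

```
# Any snapshots referencing the file would force one more COW pass:
btrfs subvolume list /mnt/src

# The inode flags can be confirmed from userspace too; lsattr should
# show the 'C' (no-COW) attribute on the file:
lsattr /mnt/src/bigfile
```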

Re: [PATCH 2/2] Btrfs: do not delete a subvolume which is in a R/O subvolume

2012-10-24 Thread cwillu
On Wed, Oct 24, 2012 at 4:03 AM, Miao Xie  wrote:
> On Mon, 22 Oct 2012 05:57:12 -0600, cwillu wrote:
>> On Mon, Oct 22, 2012 at 5:39 AM, Miao Xie  wrote:
>>> Step to reproduce:
>>>  # mkfs.btrfs 
>>>  # mount  
>>>  # btrfs sub create /subv0
>>>  # btrfs sub snap  /subv0/snap0
>>>  # change /subv0 from R/W to R/O
>>>  # btrfs sub del /subv0/snap0
>>>
>>> We deleted the snapshot successfully. I think we should not be able to 
>>> delete
>>> the snapshot since the parent subvolume is R/O.
>>
>> snap0 isn't read-only in that case, right?  From a user interaction
>> standpoint, this seems like it just forces a user to rm -rf rather
>> btrfs sub del, which strikes me as a bit ham-handed when all we really
>> care about is leaving a (the?) directory entry where snap0 used to be.
>>
>
> I don't think we can identify "btrfs sub del" with "rm -rf", because "rm -rf"
> will check the permission of the parent directory of each file/directory which
> is going to be deleted, but "btrfs sub del" doesn't do it, it will see all the
> file/directory in the subvolume as one, so I think it seems like a special
> "rmdir". From this standpoint, deleting a snapshot whose parent subvolume
> is readonly should be forbidden.

Sorry; reading back, I misunderstood you to mean that subv0 was marked
as a readonly subvolume, as opposed to marking the mountpoint
readonly.  The former can't work at all (it would make the pair
undeletable, as subv0 can't be deleted while it contains another
subvolume).

I'm still not sure that the latter is quite right, but I care a lot
less as one could always remount it rw (unlike ro subvolumes, as I
understand them).


Re: [PATCH 2/2] Btrfs: do not delete a subvolume which is in a R/O subvolume

2012-10-22 Thread cwillu
On Mon, Oct 22, 2012 at 5:39 AM, Miao Xie  wrote:
> Step to reproduce:
>  # mkfs.btrfs 
>  # mount  
>  # btrfs sub create /subv0
>  # btrfs sub snap  /subv0/snap0
>  # change /subv0 from R/W to R/O
>  # btrfs sub del /subv0/snap0
>
> We deleted the snapshot successfully. I think we should not be able to delete
> the snapshot since the parent subvolume is R/O.

snap0 isn't read-only in that case, right?  From a user interaction
standpoint, this seems like it just forces a user to rm -rf rather than
btrfs sub del, which strikes me as a bit ham-handed when all we really
care about is leaving a (the?) directory entry where snap0 used to be.


Re: unrecognized mount option 'compression=lzo' and defragment -c errors

2012-10-20 Thread cwillu
> 1. I also added mount option 'compression=lzo' and 'io_cache' to /home at 
> first.

Neither io_cache nor compression=lzo are options that exist.  You
probably meant compress=lzo for the first, but I really don't know
what you wanted for io_cache (inode_cache?  that's not really a
performance thing).

You need to check what the actual parameters are before you change
things.  Making stuff up simply doesn't work.


Re: Weird Warning

2012-10-19 Thread cwillu
On Fri, Oct 19, 2012 at 3:51 PM, Jérôme Poulin  wrote:
> After updating to 3.5.5, I get this on boot, and listing some dirs freezes.
> I don't have anything important on that volume but I'm willing to
> debug the problem if needed.  Would I need a more recent kernel?

Probably worth trying 3.7-rc1, or at least cmason's for-linus (which
is 3.6.0 + the btrfs changes that went into 3.7).


Re: Weird Warning

2012-10-19 Thread cwillu
On Fri, Oct 19, 2012 at 2:54 PM, Jérôme Poulin  wrote:
> I've got this weird WARNING in my system log on a freshly created FS,
> I'm using ACL with Samba, this is the only difference I could tell
> from any other FSes. It is also using Debian's Wheezy kernel which is
> quite old. Should I just ignore this or update BTRFS module?

I would strongly recommend updating, even if you hadn't seen any warnings.


Re: initramfs take a long time to load[135s]

2012-10-19 Thread cwillu
On Fri, Oct 19, 2012 at 1:02 PM, Marguerite Su  wrote:
> On Sat, Oct 20, 2012 at 2:35 AM, cwillu  wrote:
>> Without space_cache (once), btrfs has to repopulate that information
>> the slow way every mount; with it, it can just load the data from the
>> last unmount (modulo some consistency checks).
>>
>> The setting is sticky, so you don't actually need it in fstab any more
>> (although it won't hurt anything either).
>
> Thanks, cwillu!
>
> I transfer the message to openSUSE bugzilla and ask them help making
> that happen by default in openSUSE.
>
> Marguerite

Apparently mkfs.btrfs does set it by default now, so perhaps your
filesystem predates the change, or suse's btrfs-progs is too old.

mkfs.btrfs /dev/whatever followed by mounting with no options should
print "btrfs: disk space caching is enabled" to dmesg if your mkfs is
new enough, if you wish to test.


Re: initramfs take a long time to load[135s]

2012-10-19 Thread cwillu
On Fri, Oct 19, 2012 at 12:33 PM, Marguerite Su  wrote:
> On Sat, Oct 20, 2012 at 2:26 AM, cwillu  wrote:
>> That would work, but it's only necessary to mount with it once (and
>> it's probably been done already with /home), hence the -o
>> remount,space_cache
>
> Now my kernel loads in 10s, another 4s for userspace...then -.mount
> and all the systemd services.
>
> It boots like an animal!

Without space_cache (once), btrfs has to repopulate that information
the slow way every mount; with it, it can just load the data from the
last unmount (modulo some consistency checks).

The setting is sticky, so you don't actually need it in fstab any more
(although it won't hurt anything either).
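
In other words, something like this one-time sequence is enough; no
fstab edit required (paths as in your setup):

```
# Mount once with the option; it's recorded in the filesystem, and
# subsequent plain mounts keep using it:
mount -o remount,space_cache /
sync

# After the next boot, confirm it took effect:
dmesg | grep "disk space caching is enabled"
```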


Re: initramfs take a long time to load[135s]

2012-10-19 Thread cwillu
On Fri, Oct 19, 2012 at 11:02 AM, Marguerite Su  wrote:
> On Sat, Oct 20, 2012 at 12:55 AM, cwillu  wrote:
>> It appears space_cache isn't enabled on your rootfs; can you do a
>> "mount / -o remount,space_cache", sync a couple times, make some
>> coffee, and then reboot, and see if it's better?
>>
>> You should see two instances of "btrfs: disk space caching is enabled"
>> in your dmesg, one for / and the second for /home.
>>
>> Also, make sure to reply-all so that others interested can still follow 
>> along.
>
> like this
>
> UUID=9b9aa9d9-760e-445c-a0ab-68e102d9f02e  /      btrfs
>   defaults,space_cache,comment=systemd.automount  1 0
>
> UUID=559dec06-4fd0-47c1-97b8-cc4fa6153fa0  /home  btrfs
>   defaults,space_cache,comment=systemd.automount  1 0
>
> in /etc/fstab?

That would work, but it's only necessary to mount with it once (and
it's probably been done already with /home), hence the -o
remount,space_cache


Re: initramfs take a long time to load[135s]

2012-10-19 Thread cwillu
On Fri, Oct 19, 2012 at 10:18 AM, Marguerite Su  wrote:
> On Fri, Oct 19, 2012 at 11:41 PM, cwillu  wrote:
>> Also, next time just put the output directly in the email, that way
>> it's permanently around to look at and search for.
>
> Hi,
>
> I did it. here's my dmesg:

> I made the snapshot at:
>
> mount -o rw,defaults,comment=systemd.automount -t btrfs /dev/root /root
>
> and
>
> Starting Tell Plymouth To Write Out Runtime Data...
> Started Recreate Volatile Files and Directories
>
>
> is it useful this time?

More useful every time!

Can you post the full output of dmesg, or at least the first couple
hundred seconds of it?


Re: initramfs take a long time to load[135s]

2012-10-19 Thread cwillu
On Fri, Oct 19, 2012 at 9:28 AM, Marguerite Su  wrote:
> On Thu, Oct 18, 2012 at 9:28 PM, Chris Mason  wrote:
>> If it isn't the free space cache, it'll be a fragmentation problem.  The
>> easiest way to tell the difference is to get a few sysrq-w snapshots
>> during the boot.
>
> Hi, Chris,
>
> with some help from the openSUSE community, I learnt what sysrq
> snapshots are (alt+printscreen+w in tty1)...
>
> and here's my log:
>
> http://paste.opensuse.org/31094916

You need to hit alt-sysrq-w during the slowness you're trying to
instrument; the pastebin is from an hour later.

Also, next time just put the output directly in the email, that way
it's permanently around to look at and search for.
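
If reaching the console during the slowness is awkward, the same dump
can be triggered through procfs; a sketch:

```
# Enable the sysrq interface if the distro ships it disabled:
echo 1 > /proc/sys/kernel/sysrq

# Equivalent of alt-sysrq-w: dump all blocked (uninterruptible) tasks
# into the kernel log, then read it back:
echo w > /proc/sysrq-trigger
dmesg | tail -n 200
```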


Re: BTRFS filesystem is not mountable after crash

2012-10-13 Thread cwillu
On Sat, Oct 13, 2012 at 11:51 AM, Alfred Zastrow  wrote:
> Am 26.08.2012 08:17, schrieb Liu Bo:
>
>> On 08/26/2012 01:27 PM, Alfred Zastrow wrote:
>>
>>
>> Hello,
>>
>> has realy nobody a hint for me?
>>
>> Is compiling chris's latest for-linus helpful?
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git
>>
>> thanks,
>> liubo
>
>
> Hi dev's
>
> I was not able to install chris's latest for-linus under F17, but I tried
> with the latest 3.6.1 kernel, which was recently released.
> Same shit..  :-(

Chris's for-linus is currently all the btrfs changes that will be
going into 3.7.  3.6.1 won't likely have any of them.

