Re: Any chance to get snapshot-aware defragmentation?

2018-05-21 Thread Niccolò Belli

On Sunday 20 May 2018 12:59:28 CEST, Tomasz Pala wrote:

On Sat, May 19, 2018 at 10:56:32 +0200, Niccolò Belli wrote:

snapper users with hourly snapshots will not have any use for it.

Anyone with hourly snapshots is doomed anyway.


I do not agree: having hourly snapshots doesn't mean you cannot limit 
snapshots to a reasonable number. In fact you can simply keep a dozen of 
them and start discarding the older ones.
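A retention policy like that can be expressed directly in snapper's timeline limits. The sketch below is illustrative only: the config path and the exact values are assumptions, not something from this thread.

```ini
; /etc/snapper/configs/root -- hypothetical example values
TIMELINE_CREATE="yes"
TIMELINE_CLEANUP="yes"
TIMELINE_LIMIT_HOURLY="12"   ; keep only about a dozen hourly snapshots
TIMELINE_LIMIT_DAILY="0"
TIMELINE_LIMIT_WEEKLY="0"
TIMELINE_LIMIT_MONTHLY="0"
TIMELINE_LIMIT_YEARLY="0"
```

With limits like these, snapper's cleanup timer discards older timeline snapshots automatically.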

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Any chance to get snapshot-aware defragmentation?

2018-05-19 Thread Niccolò Belli

On Saturday 19 May 2018 01:55:30 CEST, Tomasz Pala wrote:

The "defrag only not-snapshotted data" mode would be enough for many
use cases and wouldn't require more RAM. One could run this before
taking a snapshot and merge _at least_ the new data.


snapper users with hourly snapshots will not have any use for it.


Re: Any chance to get snapshot-aware defragmentation?

2018-05-19 Thread Niccolò Belli

On Friday 18 May 2018 20:33:53 CEST, Austin S. Hemmelgarn wrote:
With a bit of work, it's possible to handle things sanely.  You 
can deduplicate data from snapshots, even if they are read-only 
(you need to pass the `-A` option to duperemove and run it as 
root), so it's perfectly reasonable to only defrag the main 
subvolume, and then deduplicate the snapshots against that (so 
that they end up all being reflinks to the main subvolume).  Of 
course, this won't work if you're short on space, but if you're 
dealing with snapshots, you should have enough space that this 
will work (because even without defrag, it's fully possible for 
something to cause the snapshots to suddenly take up a lot more 
space).


Been there, tried that. Unfortunately, even if I skip the defrag, a simple

duperemove -drhA --dedupe-options=noblock --hashfile=rootfs.hash rootfs

is going to eat more space than was previously available (probably due 
to autodefrag?).


Niccolò


Re: Any chance to get snapshot-aware defragmentation?

2018-05-18 Thread Niccolò Belli

On Friday 18 May 2018 19:10:02 CEST, Austin S. Hemmelgarn wrote:
and also forces the people who have ridiculous numbers of 
snapshots to deal with the memory usage or never defrag


Whoever has at least one snapshot is never going to defrag anyway, unless 
they are willing to double the used space.


Niccolò


Re: Any chance to get snapshot-aware defragmentation?

2018-05-18 Thread Niccolò Belli

On Friday 18 May 2018 18:20:51 CEST, David Sterba wrote:

Josef started working on that in 2014 and did not finish it. The patches
can be still found in his tree. The problem is in excessive memory
consumption when there are many snapshots that need to be tracked during
the defragmentation, so there are measures to avoid OOM. There's
infrastructure ready for use (shrinkers), there are maybe some problems
but fundamentally it should work.

I'd like to get the snapshot-aware working again too, we'd need to find
a volunteer to resume the work on the patchset.


Yeah, I know of Josef's work, but 4 years have passed since then without any 
news on this front.


What I would really like to know is why nobody resumed his work: is it 
because it's impossible to implement snapshot-aware defrag without 
excessive RAM usage, or is it simply because nobody is interested?


Niccolò


Re: Btrfs installation advices

2018-05-13 Thread Niccolò Belli

On Tuesday 8 May 2018 09:50:23 CEST, Rolf Wald wrote:

You need to build three partitions, e.g. named boot, swap, root.


You don't need to use an unencrypted boot if you use grub:
https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_an_entire_system#Encrypted_boot_partition_.28GRUB.29

A few hints for btrfs + LUKS + swap:
https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_an_entire_system#Btrfs_subvolumes_with_swap

Another solution is to use SED, as someone mentioned:
https://wiki.archlinux.org/index.php/Self-Encrypting_Drives

The only downside is that you can rest assured there will be NSA backdoors 
in hardware crypto.


Even better, I suggest moving to ZFS and using Native Encryption:
https://github.com/zfsonlinux/zfs/pull/5769

I recently got tired of btrfs never implementing things like snapshot-aware 
defrag (with no signs on the horizon that this is going to change soon) so 
I decided to switch my servers to ZFS. I'll let you know how crypto works 
if you're interested. I'll keep using btrfs on the clients though, at least 
for now.


Niccolò


Any chance to get snapshot-aware defragmentation?

2018-05-11 Thread Niccolò Belli

Hi,
I've been waiting for this feature for years, and initially it seemed like 
something that would be worked on sooner or later.
A long time has passed without any progress on this, so I would like to 
know whether there is any technical limitation preventing it or whether it 
could possibly land in the near future.


Thanks,
Niccolò


Kickstarting snapshot-aware defrag?

2017-10-03 Thread Niccolò Belli

Hi,
It seems to me that the proposal[1] for snapshot-aware defrag has long 
been abandoned. Since most people badly need this feature, I thought about 
how to possibly speed up achieving this goal.


I know of several bounty-based kickstarting platforms; among them the best 
ones are probably bountysource.com[2] and freedomsponsors.org[3].
With both platforms, anyone interested can place a bounty on the issue, and 
if/when someone implements it they will collect the bounty.
I created an issue on both of them just to show how the platforms would 
handle it.


Since btrfs is a small community, before actually placing bounties and 
sponsoring it, I would like to know whether anyone is against this 
development model, or whether anyone would be interested in implementing a 
feature because of a bounty.


Bests,
Niccolò


[1]https://www.spinics.net/lists/linux-btrfs/msg34539.html
[2]https://www.bountysource.com/issues/50004702-feature-request-snapshot-aware-defrag
[3]https://freedomsponsors.org/issue/817/feature-request-snapshot-aware-defrag?alert=KICKSTART


Re: Why do full balance and deduplication reduce available free space?

2017-10-02 Thread Niccolò Belli

On 2017-10-02 21:35, Kai Krakow wrote:

Besides defragging removing the reflinks, duperemove will unshare your
snapshots when used in this way: If it sees duplicate blocks within the
subvolumes you give it, it will potentially unshare blocks from the
snapshots while rewriting extents.

BTW, you should be able to use duperemove with read-only snapshots if
used in read-only-open mode. But I'd rather suggest to use bees
instead: It works at whole-volume level, walking extents instead of
files. That way it is much faster, doesn't reprocess already
deduplicated extents, and it works with read-only snapshots.

Until my patch it didn't like mixed nodatasum/datasum workloads.
Currently this is fixed by just leaving nocow data alone as users
probably set nocow for exactly the reason to not fragment extents and
relocate blocks.


Bad Btrfs Feature Interactions: btrfs read-only snapshots (never tested, 
probably wouldn't work well)


Unfortunately it seems that bees doesn't support read-only snapshots, so 
that's a no-go.


P.S.
I tried duperemove with -A, but besides taking much longer, it didn't 
improve the situation.
Are you sure that the culprit is duperemove? AFAIK it shouldn't unshare 
extents...


Niccolò


Re: Why do full balance and deduplication reduce available free space?

2017-10-02 Thread Niccolò Belli
Maybe this is because of the autodefrag mount option? I thought it 
wasn't supposed to unshare lots of extents...


Niccolò


RE: Is it really possible to dedupe read-only snapshots!?

2017-10-02 Thread Niccolò Belli

On 2017-10-02 13:14, Paul Jones wrote:

I use bees for deduplication and it will quite happily dedupe
read-only snapshots.


AFAIK no, it isn't possible. Source: 
https://www.spinics.net/lists/linux-btrfs/msg60385.html
"It should be possible to deduplicate a read-only file to a read-write 
one, but that's probably not worth the effort in many real-world use 
cases."



You could always change them to RW while dedupe
is running then change back to RO.


AFAIK it will break send/receive; can someone confirm?

Niccolò


Re: Why do full balance and deduplication reduce available free space?

2017-10-02 Thread Niccolò Belli

On 2017-10-02 12:16, Hans van Kranenburg wrote:

On 10/02/2017 12:02 PM, Niccolò Belli wrote:

[...]

Since I use lots of snapshots [...] I had to
create a systemd timer to perform a full balance and deduplication 
each

night.


Can you explain what's your reasoning behind this 'because X it needs
Y'? I don't follow.


Available free space is important to me, so I want snapshots to be 
deduplicated as well. Since I cannot deduplicate snapshots because they 
are read-only, the data must already be deduplicated before the 
snapshots are taken. I do not consider the hourly snapshots, because within 
a day they will be gone anyway, but daily snapshots stay around for 
much longer, so I want them to be deduplicated.


Niccolò


Why do full balance and deduplication reduce available free space?

2017-10-02 Thread Niccolò Belli

Hi,
I have several subvolumes mounted with compress-force=lzo and autodefrag. 
Since I use lots of snapshots (snapper keeps around 24 hourly, 7 daily and 
4 weekly snapshots), I had to create a systemd timer to perform a full 
balance and deduplication each night. In fact, data needs to already be 
deduplicated when snapshots are created; otherwise I have no other way to 
deduplicate the snapshots.


This is how I perform the balance: btrfs balance start --full-balance rootfs
This is how I perform the deduplication (duperemove is from git master):
duperemove -drh --dedupe-options=noblock --hashfile=../rootfs.hash 
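The nightly job described above could be wired up as a pair of systemd units. This is only a sketch: the unit names, the target paths, and the hashfile location are my assumptions, not something stated in the thread.

```ini
; /etc/systemd/system/btrfs-nightly.service -- hypothetical unit
[Unit]
Description=Nightly btrfs full balance and deduplication

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs balance start --full-balance /
ExecStart=/usr/bin/duperemove -drh --dedupe-options=noblock --hashfile=/root/rootfs.hash /

; /etc/systemd/system/btrfs-nightly.timer -- hypothetical unit
[Unit]
Description=Run btrfs-nightly.service every night

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `systemctl enable --now btrfs-nightly.timer`; `OnCalendar=daily` fires at midnight, and `Persistent=true` catches up if the machine was off.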



Looking at the logs I noticed something weird: available free space 
actually decreases after balance or deduplication.


This is just before the timer starts:

Overall:
   Device size: 128.00GiB
   Device allocated: 49.03GiB
   Device unallocated:   78.97GiB
   Device missing:  0.00B
   Used: 43.78GiB
   Free (estimated): 82.97GiB  (min: 82.97GiB)
   Data ratio:   1.00
   Metadata ratio:   1.00
   Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:44.00GiB, Used:40.00GiB
  /dev/sda5  44.00GiB

Metadata,single: Size:5.00GiB, Used:3.78GiB
  /dev/sda5   5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
  /dev/sda5  32.00MiB

Unallocated:
  /dev/sda5  78.97GiB



I also manually performed a full balance just before the timer starts:

Overall:
   Device size: 128.00GiB
   Device allocated: 46.03GiB
   Device unallocated:   81.97GiB
   Device missing:  0.00B
   Used: 43.78GiB
   Free (estimated): 82.96GiB  (min: 82.96GiB)
   Data ratio:   1.00
   Metadata ratio:   1.00
   Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:41.00GiB, Used:40.01GiB
  /dev/sda5  41.00GiB

Metadata,single: Size:5.00GiB, Used:3.77GiB
  /dev/sda5   5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
  /dev/sda5  32.00MiB

Unallocated:
  /dev/sda5  81.97GiB



As you can see, even doing a full balance was enough to reduce the available 
free space!


Then the timer started and it performed the deduplication:

Overall:
   Device size: 128.00GiB
   Device allocated: 46.03GiB
   Device unallocated:   81.97GiB
   Device missing:  0.00B
   Used: 43.87GiB
   Free (estimated): 82.94GiB  (min: 82.94GiB)
   Data ratio:   1.00
   Metadata ratio:   1.00
   Global reserve:  512.00MiB  (used: 176.00KiB)

Data,single: Size:41.00GiB, Used:40.03GiB
  /dev/sda5  41.00GiB

Metadata,single: Size:5.00GiB, Used:3.84GiB
  /dev/sda5   5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
  /dev/sda5  32.00MiB

Unallocated:
  /dev/sda5  81.97GiB



Once again it reduced the available free space!

Then, after the deduplication, the timer also performed a full balance:

Overall:
   Device size: 128.00GiB
   Device allocated: 46.03GiB
   Device unallocated:   81.97GiB
   Device missing:  0.00B
   Used: 44.00GiB
   Free (estimated): 82.93GiB  (min: 82.93GiB)
   Data ratio:   1.00
   Metadata ratio:   1.00
   Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:41.00GiB, Used:40.04GiB
  /dev/sda5  41.00GiB

Metadata,single: Size:5.00GiB, Used:3.97GiB
  /dev/sda5   5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
  /dev/sda5  32.00MiB

Unallocated:
  /dev/sda5  81.97GiB




It further reduced the available free space! Balance and deduplication 
actually reduced my available free space by 400MB!

400MB each night!
How is that possible? Should I avoid doing balances and deduplication at 
all?


Thanks,
Niccolò


Cannot fix btrfs errors after system crash

2017-09-27 Thread Niccolò Belli

Hi,
I was trying to use AMDGPU-PRO's OpenCL stack (with the mainline 4.12.13 
kernel) when it suddenly crashed the whole system; not even the magic SysRq 
keys worked anymore.
Unsurprisingly, at the next reboot I found several btrfs warnings (see 
https://paste.pound-python.org/show/S5zBG2tXZUTLG699saE5/).
Since btrfs scrub didn't find any errors, I decided to reboot into a live USB 
and start a btrfs check (I'm using btrfs-progs 4.13).
It did indeed find lots of errors (see 
https://paste.pound-python.org/show/IPxh9sly0EEb0MKPi2dw/).
So I made a full backup with dd and started a btrfs check --repair (see 
https://paste.pound-python.org/show/c9AlT8ehKKJy6l5xhzXk/).

I also wiped the space cache with --clear-space-cache v1.
A subsequent btrfs check revealed it had indeed fixed lots of errors (see 
https://paste.pound-python.org/show/1m2Wodd1q3n0eRlxLpZB/), but 
unfortunately I still have the following errors:


unresolved ref dir 7450239 index 2 namelen 6 name 431886 filetype 1 errors 80, filetype mismatch
unresolved ref dir 7450595 index 2 namelen 6 name 431886 filetype 1 errors 80, filetype mismatch
unresolved ref dir 7457122 index 2 namelen 6 name 431886 filetype 1 errors 80, filetype mismatch


I'm already quite satisfied, to be honest: two years ago repair used to eat 
my data, making things worse.
Anyway, why didn't btrfs check repair them? Is there anything I can do to 
fix them?


Thanks,
Niccolò


Re: RAID56 status?

2017-01-24 Thread Niccolò Belli

+1

On Tuesday 24 January 2017 00:31:42 CET, Christoph Anton Mitterer wrote:

On Mon, 2017-01-23 at 18:18 -0500, Chris Mason wrote:

We've been focusing on the single-drive use cases internally.  This
year 
that's changing as we ramp up more users in different places.  
Performance/stability work and raid5/6 are the top of my list right

now.

+1

Would be nice to get some feedback on what happens behind the scenes...
 actually I think a regular btrfs development blog could be generally a
nice thing :)

Cheers,
Chris.





Re: Convert from RAID 5 to 10

2016-12-01 Thread Niccolò Belli

On Thursday 1 December 2016 10:37:13 CET, Wilson Meier wrote:

The only thing i have asked for is to document the *known*
problems/flaws/limitations of all raid profiles and link to them from
the stability matrix.


+1

Does anyone mind if I ask for an account and start copy-pasting any 
relevant posts from this thread?


Niccolò Belli


Re: Convert from RAID 5 to 10

2016-11-30 Thread Niccolò Belli

I completely agree, the state of the whole wiki is simply *FRUSTRATING*.

Niccolò Belli

On Wednesday 30 November 2016 14:12:36 CET, Wilson Meier wrote:

On 30/11/16 at 11:41, Duncan wrote:

Wilson Meier posted on Wed, 30 Nov 2016 09:35:36 +0100 as excerpted:
 ...

Hi Duncan,

I understand your arguments but cannot fully agree.
First of all, I'm not sticking with old stale versions of anything, as I
try to keep my system up to date.
My kernel is 4.8.4 (Gentoo) and btrfs-progs is 4.8.4.
That being said, I'm quite aware of the heavy development status of
btrfs, but pointing the finger at the users, saying that they don't fully
understand the status of btrfs, without providing the information on the
wiki is, in my opinion, not the right way. Heavy development doesn't mean
that features marked as OK are "not" or "mostly" OK in the context of
overall btrfs stability.
There is no indication on the wiki that raid1 or any other raid profile
(except for raid5/6) suffers from the problems stated in this thread.
If there are known problems, then the stability matrix should point them
out or link to a corresponding wiki entry; otherwise one has to assume
that the features marked as "ok" are in fact "ok".
And yes, the overall btrfs stability should be put on the wiki.

Just to give you a quick overview of my history with btrfs:
I migrated away from MD RAID and ext4 to btrfs raid6 because of its CoW
and checksum features, at a time when raid6 was not considered fully stable
but also not badly broken.
After a few months I had a disk failure and the raid could not recover.
I looked at the wiki and the mailing list and noticed that raid6 had been
marked as badly broken :(
I was quite happy to have a backup. So I asked on the btrfs IRC channel
(the wiki had no relevant information) whether raid10 is usable or suffers
from the same problems. The summary was "Yes, it is usable and has no
known problems". So I migrated to raid10. Now I know that raid10 (marked
as ok) also has problems with 2 disk failures in different stripes and
can in fact lead to data loss.
I thought, hmm, OK, I'll split my data and use raid1 (marked as ok). And
again the mailing list states that raid1 also has problems in case of
recovery.

It is really disappointing not to have this information in the wiki
itself. This would have saved me, and I'm quite sure others too, a lot
of time.
Sorry for being a bit frustrated.







Re: [Not TLS] mount option nodatacow for VMs on SSD?

2016-11-29 Thread Niccolò Belli

On Tuesday 29 November 2016 06:14:18 CET, Duncan wrote:
Very good question that I don't know the answer to as I've not seen it 
discussed previously.  (I'm not a dev, just a list regular and user of 
btrfs myself, and my personal use-case involves neither snapshots nor 
send/receive, so on those topics if I've not seen it covered previously 
either here or on the wiki, I won't know.)


Does someone else know?


Sounds too good to be true; I somehow feel the answer will be "no" :(

Niccolò Belli


Re: mount option nodatacow for VMs on SSD?

2016-11-28 Thread Niccolò Belli

On Monday 28 November 2016 09:20:15 CET, Kai Krakow wrote:

You can, however, use chattr to make the subvolume root directory (that
one where it is mounted) nodatacow (chattr +C) _before_ placing any
files or directories in there. That way, newly created files and
directories will inherit the flag. Take note that this flag can only
applied to directories and empty (zero-sized) files.


Do I keep checksumming for this directory that way?
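For reference, Kai's suggestion can be sketched as below. Mount point, device, and file names are hypothetical. Note that, as far as I know, nodatacow files also lose data checksumming (and compression), which answers the question above in the negative.

```shell
# Sketch, assuming a dedicated subvolume for VM images on a btrfs volume.
# The +C (nodatacow) flag must be set while the directory is still empty.
mount -o subvol=@images /dev/mapper/cryptroot /mnt/images
chattr +C /mnt/images          # new files/dirs created here inherit nodatacow
lsattr -d /mnt/images          # should now list the 'C' attribute
# Files created from now on are nodatacow (and therefore not checksummed):
touch /mnt/images/vm.qcow2
lsattr /mnt/images/vm.qcow2
```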

Niccolò Belli


My system mounts the wrong btrfs partition, from the wrong disk!

2016-11-25 Thread Niccolò Belli
This is something pretty unbelievable, so I had to repeat it several times 
before finding the courage to actually post it to the mailing list :)


After dozens of data losses I don't trust my btrfs partition that much, so I 
make a backup copy with dd weekly. Yesterday I was going to do some 
balancing and deduplication, but since the disk was full I had to remove 
the contents of a whole subvolume (16GB) to make some space available to the 
tools (I had the backup made with dd on an external drive).


After the balance and deduplication I attached and mounted the external 
drive with the backup, and then I mounted the img file with the copy of the 
partition. To my great surprise, the subvolume in the backup which should 
have contained the files I deleted was empty! So I rebooted into a live USB 
and mounted the backup again: my files were still there, phew! Then I 
mounted the partition on the laptop's disk and tried to copy the files from 
the backup, and it complained that they already existed! If I unmount both 
the backup and the real disk and then mount the real disk again, it's empty 
as it should be.


These are the exact steps to reproduce it from the live usb:

# Opening the encrypted partition from the real disk and then mounting it
cryptsetup luksOpen /dev/sda5 cryptroot
mount -o noatime,compress=lzo,autodefrag /dev/mapper/cryptroot /real_disk

ls /real_disk/@Pictures --> empty (as it should)

# Mounting the external disk with the backup
mount /dev/sdb1 /external_disk

# Mounting the unencrypted backup from the external disk
mount /external_disk/backup.img /backup

ls /backup/@Pictures ---> empty (*it shouldn't!*)

umount /backup
umount /external_disk
cryptsetup luksClose cryptroot

# Mounting the external disk with the backup
mount /dev/sdb1 /external_disk

# Mounting the unencrypted backup from the external disk
mount /external_disk/backup.img /backup

ls /backup/@Pictures ---> 16GB of photos (as it should)

# Opening the encrypted partition from the real disk and then mounting it
cryptsetup luksOpen /dev/sda5 cryptroot
mount -o noatime,compress=lzo,autodefrag /dev/mapper/cryptroot /real_disk

ls /real_disk/@Pictures --> 16GB of photos (it *shouldn't!*)


I really don't know where the bug may lie, probably not even in btrfs, but I 
didn't know where else to report it. I'm using Arch Linux with kernel 4.8.10 
and the live USB is an Arch live image with kernel 4.8 too.
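One thing worth checking in a setup like this: a dd image carries the same filesystem UUID as the original, and btrfs identifies a filesystem's devices by that UUID, so having the clone and the original visible at the same time can confuse the kernel's device scanning into assembling the wrong one. A diagnostic sketch (device names and mount points are the examples from the steps above):

```shell
# Both should report the same UUID if the image is a dd clone:
blkid /dev/mapper/cryptroot /external_disk/backup.img

# If btrfs lists both devices under one filesystem UUID, they are being
# treated as the same filesystem:
btrfs filesystem show

# Check which backing device actually got mounted at each mount point:
findmnt -no SOURCE,UUID /real_disk
findmnt -no SOURCE,UUID /backup
```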


Niccolò Belli


Increased disk usage after deduplication and system running out of memory

2016-11-24 Thread Niccolò Belli

[beginning of the btrfs filesystem usage output truncated in the archive]

Unallocated:
  /dev/mapper/cryptroot  16.35GiB


$ cat after_duperemove_and_balance

Overall:
   Device size: 152.36GiB
   Device allocated: 136.03GiB
   Device unallocated:   16.33GiB
   Device missing:  0.00B
   Used: 133.81GiB
   Free (estimated): 16.55GiB  (min: 16.55GiB)
   Data ratio:   1.00
   Metadata ratio:   1.00
   Global reserve:  512.00MiB  (used: 0.00B)


Data,single: Size:127.00GiB, Used:126.77GiB
  /dev/mapper/cryptroot 127.00GiB

Metadata,single: Size:9.00GiB, Used:7.03GiB
  /dev/mapper/cryptroot   9.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
  /dev/mapper/cryptroot  32.00MiB

Unallocated:
  /dev/mapper/cryptroot  16.33GiB


As you can see it freed 5.41 GB of data, but it also added 5.24 GB of 
metadata. The estimated free space is now 16.55 GB, while before the 
deduplication it was higher: 17.17 GB.


This is when running duperemove from git with noblock, but almost nothing 
changes if I omit it (it defaults to block).
Why did my metadata increase by a 4x factor? 99% of my data already had 
shared extents because of snapshots, so why such a huge increase?


Deduplication didn't finish up to 100%, because duperemove got killed by 
OOM killer at 99%: 
https://paste.pound-python.org/show/yUcIOSzXcrfNPkF9rV2L/


As you can see from dmesg 
(https://paste.pound-python.org/show/eZIkpxUU6QR9ij6Rn1Oq/) there is no 
process stealing so much memory (my system has 8GB): the biggest one takes 
as much as 700MB of vm.


Another strange thing that you can see from the previous log is that it 
tries to deduplicate /home/niko/nosnap/rootfs/@images/fedora25.qcow2, which 
is a UNIQUE file. The image is stored in a separate subvolume because I 
don't want it to be snapshotted, so I'm pretty sure there are no other 
copies of this image, but it still tries to deduplicate it.


Niccolò Belli

Re: degraded BTRFS RAID 1 not mountable: open_ctree failed, unable to find block group for 0

2016-11-20 Thread Niccolò Belli

On Thursday 17 November 2016 21:20:56 CET, Austin S. Hemmelgarn wrote:

On 2016-11-17 15:05, Chris Murphy wrote:

I think the wiki should be updated to reflect that raid1 and raid10
are mostly OK. I think it's grossly misleading to consider either as
green/OK when a single degraded read write mount creates single chunks
that will then prevent a subsequent degraded read write mount. And
also the lack of various notifications of device faultiness I think
make it less than OK also. It's not in the "do not use" category but
it should be in the middle ground status so users can make informed
decisions.


It's worth pointing out also regarding this:
* This is handled sanely in recent kernels (the check got 
changed from per-fs to per-chunk, so you still have a usable FS 
if all the single chunks are only on devices you still have).
* This is only an issue with filesystems with exactly two 
disks.  If a 3+ disk raid1 FS goes degraded, you still generate 
raid1 chunks.
* There are a couple of other cases where raid1 mode falls flat 
on its face (lots of I/O errors in a short span of time with 
compression enabled can cause a kernel panic, for example).
* raid10 has some other issues of its own (you lose two 
devices and your filesystem is dead, which shouldn't be the case 
100% of the time: if you lose different parts of each mirror, 
BTRFS _should_ be able to recover, it just doesn't do so right 
now).


As far as the failed device handling issues, those are a 
problem with BTRFS in general, not just raid1 and raid10, so I 
wouldn't count those against raid1 and raid10.


Everything you mentioned should be in the wiki IMHO. Knowledge is power.


Re: Announcing btrfs-dedupe

2016-11-18 Thread Niccolò Belli
  1575   15714810329 220   4 
2473 0 konsole
[ 6342.147739] [ 1579]  1000  1579 4021   56  13   3  
164 0 bash
[ 6342.147741] [ 1582] 0  158217563   23  38   3  
248 0 sudo
[ 6342.147742] [ 1583] 0  1583 3425   53  11   3  
118 0 duperemove.sh
[ 6342.147744] [ 4060] 0  4060   16850192579 203   3   
24 0 duperemove
[ 6342.147746] Out of memory: Kill process 4060 (duperemove) score 21 or 
sacrifice child
[ 6342.147754] Killed process 4060 (duperemove) total-vm:674004kB, 
anon-rss:367672kB, file-rss:2644kB, shmem-rss:0kB




Any idea? The process with the highest total_vm is plasmashell, but it has 
only 900MB of vm.


Niccolò Belli


Re: Announcing btrfs-dedupe

2016-11-16 Thread Niccolò Belli

On Tuesday 15 November 2016 18:52:01 CET, Zygo Blaxell wrote:

Like I said, millions of extents per week...

64K is an enormous dedup block size, especially if it comes with a 64K
alignment constraint as well.

These are the top ten duplicate block sizes from a sample of 95251
dedup ops on a medium-sized production server with 4TB of filesystem
(about one machine-day of data):


Which software do you use to dedupe your data? I tried duperemove, but it 
gets killed by the OOM killer because it triggers some kind of memory leak: 
https://github.com/markfasheh/duperemove/issues/163
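To make the block-size discussion above concrete, here is a rough sketch of the scan phase of a block-based deduplicator such as duperemove: hash fixed-size blocks of each file and report hashes seen in more than one place. The 64 KiB block size and the helper names are illustrative; real tools then submit the matched ranges to the kernel for deduplication rather than just printing them.

```shell
# Sketch: find candidate duplicate blocks by hashing fixed-size chunks.
BLOCK=65536   # 64 KiB, the dedup block size discussed above

scan_blocks() {
    # Print "hash path offset" for every BLOCK-sized chunk of each file.
    for f in "$@"; do
        size=$(stat -c %s "$f")
        off=0
        while [ "$off" -lt "$size" ]; do
            hash=$(dd if="$f" bs="$BLOCK" skip=$((off / BLOCK)) count=1 2>/dev/null \
                   | sha256sum | awk '{print $1}')
            printf '%s %s %s\n' "$hash" "$f" "$off"
            off=$((off + BLOCK))
        done
    done
}

duplicate_blocks() {
    # Keep only hashes that occur more than once: these are dedupe candidates.
    scan_blocks "$@" | sort | awk '
        { count[$1]++; lines[$1] = lines[$1] $0 "\n" }
        END { for (h in count) if (count[h] > 1) printf "%s", lines[h] }'
}
```

Running `duplicate_blocks fileA fileB` lists the (path, offset) pairs whose contents hash identically, i.e. the ranges a dedupe tool would hand to the kernel.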


Niccolò Belli


Re: Announcing btrfs-dedupe

2016-11-09 Thread Niccolò Belli

Hi,
What do you think about jdupes? I'm searching for an alternative to 
duperemove, and rmlint doesn't seem to support btrfs deduplication, so I 
would like to try jdupes. My main problem with duperemove is a memory leak; 
it also seems to lead to greater disk usage: 
https://github.com/markfasheh/duperemove/issues/163


Niccolo' Belli

On Tuesday 8 November 2016 23:36:25 CET, Saint Germain wrote:

Please be aware of these other similar softwares:
- jdupes: https://github.com/jbruchon/jdupes
- rmlint: https://github.com/sahib/rmlint
And of course fdupes.

Some interesting points I have seen in them:
- use xxhash to identify potential duplicates (huge speedup)
- ability to deduplicate read-only snapshots
- identify potential reflinked files (see also my email here:
  https://www.spinics.net/lists/linux-btrfs/msg60081.html)
- ability to filter out hardlinks
- triangle problem: see jdupes readme
- jdupes has started the process to be included in Debian

I hope that will help and that you can share some codes with them !



Re: Announcing btrfs-dedupe

2016-11-08 Thread Niccolò Belli

On Tuesday 8 November 2016 17:58:52 CET, James Pharaoh wrote:
Yes, everything you have described here is something I intend 
to create, and might as well include in the tool itself. I'll 
add it to the roadmap ;-)


Sounds good, but I have yet another feature request which is even more 
interesting in my opinion.
If you ever used snapper you probably already found yourself in the 
position where you want to free some space and you actually can't, because 
the files you want to delete are already present in countless snapshots. 
This means you have to delete the unwanted files from every snapshot, 
which is a tedious task, even more difficult if you moved/renamed those 
files. What I actually do is exploit duperemove's hashfile to grep for 
the checksum and obtain all the paths. Then I have to switch the 
snapshots to rw, manually delete each file and finally switch them back to 
ro. A tool which automates this task would be awesome.
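The rw-toggle loop described above can be sketched in shell. This is a hypothetical helper, not existing snapper or duperemove functionality: it assumes snapper's default `<dir>/<number>/snapshot` layout and the `btrfs property` command from btrfs-progs, and with DRY_RUN=1 set it only prints what it would run.

```shell
# Sketch: remove one file (by path relative to the subvolume root) from
# every snapshot by flipping each snapshot rw, deleting, flipping back ro.
run() { if [ -n "$DRY_RUN" ]; then echo "$@"; else "$@"; fi; }

purge_from_snapshots() {
    snapdir=$1   # e.g. /.snapshots (snapper's default layout is assumed)
    relpath=$2   # path of the unwanted file inside each snapshot
    for snap in "$snapdir"/*/snapshot; do
        [ -e "$snap/$relpath" ] || continue
        run btrfs property set -ts "$snap" ro false
        run rm -- "$snap/$relpath"
        run btrfs property set -ts "$snap" ro true
    done
}
```

Finding the paths in the first place (the hashfile grep step) still has to be done separately, since the hashfile format is duperemove's own.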


Niccolo'


Re: Announcing btrfs-dedupe

2016-11-08 Thread Niccolò Belli

On Tuesday 8 November 2016 12:38:48 CET, James Pharaoh wrote:
You can't deduplicate a read-only snapshot, but you can create 
read-write snapshots from them, deduplicate those, and then 
recreate the read-only ones. This is what I've done.


Since snapper creates hundreds of snapshots, isn't this something that the 
deduplication software could do for me if I explicitly tell it to do so? I 
mean momentarily switching the snapshot to rw in order to deduplicate it, 
then switching it back to ro.


In theory, once this has been done once, it shouldn't have to 
be done again, at least for those snapshots, unless you want to 
modify the deduplication. It's probably a good idea to 
defragment files and directories first, as well.


I can't defragment anything, because it would take too much free space to 
do so with so many snapshots. Instead, the deduplication software could 
defragment each file before calling the extent-same ioctl; that would be 
feasible. That way you would not need huge amounts of free space to 


It should be possible to deduplicate a read-only file to a 
read-write one, but that's probably not worth the effort in many 
real-world use cases.


This is exactly what I would expect a deduplication tool to do when it 
encounters a ro snapshot, except when I explicitly tell it to momentarily 
switch the snapshot to rw in order to deduplicate it.


Niccolo' Belli


Re: Announcing btrfs-dedupe

2016-11-08 Thread Niccolò Belli
Nice, you should probably update the btrfs wiki as well, because there is 
no mention of btrfs-dedupe.


First question, why this name? Don't you plan to support xfs as well?

Second question, I'm trying deduplication tools for the very first time and 
I still have to figure out how to handle snapper snapshots, which are read 
only. I currently tried duperemove 0.11 git and I get tons of "Error 30: 
Read-only file system while opening "/.../@snapshots/4385/...". How am I 
supposed to handle snapper snapshots?


I do not run duperemove from a live distro, instead I run it directly on 
the system I want to deduplicate:


sudo mount -o noatime,compress=lzo,autodefrag /dev/mapper/cryptroot 
/home/niko/nosnap/rootfs/
sudo duperemove -drh --dedupe-options=nofiemap 
--hashfile=/home/niko/nosnap/rootfs.hash /home/niko/nosnap/rootfs/


Is btrfs-dedupe able to handle snapper snapshots?

Thanks,
Niccolo' Belli


Re: Amount of scrubbed data goes from 15.90GiB to 26.66GiB after defragment -r -v -clzo on a fs always mounted with compress=lzo

2016-05-20 Thread Niccolò Belli

On Friday 13 May 2016 08:11:27 CEST, Duncan wrote:
In theory the various btrfs dedup solutions out there should work as 
well, while letting you keep the snapshots (at least to the extent 
they're either writable snapshots so can be reflink modified


Unfortunately as you said dedup doesn't work with read-only snapshots (I 
only use read-only snapshots with snapper) :(


Does bedup's dedup-syscall branch 
(https://github.com/g2p/bedup/tree/wip/dedup-syscall), which uses the new 
batch deduplication ioctl merged in Linux 3.12, fix this? Unfortunately the 
latest commit is from September :(



Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-13 Thread Niccolò Belli

On Friday 13 May 2016 13:35:01 CEST, Austin S. Hemmelgarn wrote:
The fact that you're getting an OOPS involving core kernel 
threads (kswapd) is a pretty good indication that either there's 
a bug elsewhere in the kernel, or that something is wrong with 
your hardware.  it's really difficult to be certain if you don't 
have a reliable test case though.


Talking about reliable test cases, I forgot to say that I definitely found 
an interesting one. It doesn't lead to an OOPS but perhaps to something even 
more interesting. While running countless stress tests I tried running some 
games to stress the system in different ways. I chose openmw (an open 
source engine for Morrowind) and I played it for a while on my second 
external monitor (while watching some monitoring tools on my first 
monitor). I noticed that after playing for a while I *always* lose internet 
connection (I use a USB3 Gigabit Ethernet adapter). This isn't the only 
thing which happens: even if the game keeps running flawlessly and the 
system *seems* to work fine (I can drag windows, open the terminal...) lots 
of commands simply stall (for example mounting a partition, unmounting it, 
rebooting...). I can reliably reproduce it, it ALWAYS happens.


Niccolò


Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-13 Thread Niccolò Belli

On Thursday 12 May 2016 17:43:38 CEST, Austin S. Hemmelgarn wrote:
That's probably a good indication of the CPU and the MB being 
OK, but not necessarily the RAM.  There's two other possible 
options for testing the RAM that haven't been mentioned yet 
though (which I hadn't thought of myself until now):
1. If you have access to Windows, try the Windows Memory 
Diagnostic. This runs yet another slightly different set of 
tests from memtest86 and memtest86+, so it may catch issues they 
don't.  You can start this directly on an EFI system by loading 
/EFI/Microsoft/Boot/MEMTEST.EFI from the EFI system partition.
2. This is a Dell system.  If you still have the utility 
partition which Dell ships all their per-provisioned systems 
with, that should have a hardware diagnostics tool.  I doubt 
that this will find anything (it's part of their QA procedure 
AFAICT), but it's probably worth trying, as the memory testing 
in that uses yet another slightly different implementation of 
the typical tests.  You can usually find this in the boot 
interrupt menu accessed by hitting F12 before the boot-loader 
loads.


I tried the Dell System Test, including the enhanced optional RAM tests, 
and it was fine. I also tried the Microsoft one, which passed. BUT if I 
select the advanced test in the Microsoft one, it always stops at 21% of 
the first test. The test menus are still working, but the fans get quiet 
and it keeps writing "test running... 21%" forever. I tried it many times 
and it always got stuck at 21%, so I suspect a test suite bug rather than 
a RAM failure.


I also noticed some other interesting behaviours: while I was running the 
usual scrub+check (both were fine) from the livecd I noticed this in dmesg:
[  261.301159] BTRFS info (device dm-0): bdev /dev/mapper/cryptroot errs: 
wr 0, rd 0, flush 0, corrupt 4, gen 0
Corrupt? But both scrub and check were fine... I double checked scrub and 
check and they were still fine.


This is what happened another time: 
https://drive.google.com/open?id=0Bwe9Wtc-5xF1dGtPaWhTZ0w5aUU
I was making a backup of my partition USING DD from the livecd. It wasn't 
even mounted if I recall correctly!


On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote:

That's what a RAM corruption problem looks like when you run btrfs scrub.
Maybe the RAM itself is OK, but *something* is scribbling on it.

Does the Arch live usb use the same kernel as your normal system?


Yes, except for the point release (the system is slightly ahead of the 
liveusb).


On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote:

Did you try an older (or newer) kernel?  I've been running 4.5.x on a few
canary systems, but so far none of them have survived more than a day.


No (except for point releases from 4.5.0 to 4.5.4), but I will try 4.4.

On Thursday 12 May 2016 18:48:17 CEST, Zygo Blaxell wrote:

It's possible there's a problem that affects only very specific chipsets
You seem to have eliminated RAM in isolation, but there could be a problem
in the kernel that affects only your chipset.


Funny considering it is sold as a Linux laptop. Unfortunately they only 
tested it with the ancient Ubuntu 14.04.


Niccolò


Re: Amount of scrubbed data goes from 15.90GiB to 26.66GiB after defragment -r -v -clzo on a fs always mounted with compress=lzo

2016-05-12 Thread Niccolò Belli
Thanks for the detailed explanation; hopefully in the future someone will 
be able to make defrag snapshot/reflink aware in a scalable manner.
I will not use defrag anymore, but what do you suggest I do to reclaim 
the lost space? Get rid of my current snapshots, or maybe simply run 
bedup?


Niccolò


Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-12 Thread Niccolò Belli

On Monday 9 May 2016 18:29:41 CEST, Zygo Blaxell wrote:

Did you also check the data matches the backup?  btrfs check will only
look at the metadata, which is 0.1% of what you've copied.  From what
you've written, there should be a lot of errors in the data too.  If you
have incorrect data but btrfs scrub finds no incorrect checksums, then
your storage layer is probably fine and we have to look at CPU, host RAM,
and software as possible culprits.

The logs you've posted so far indicate that bad metadata (e.g. negative
item lengths, nonsense transids in metadata references but sane transids
in the referred pages) is getting into otherwise valid and well-formed
btrfs metadata pages.  Since these pages are protected by checksums,
the corruption can't be originating in the storage layer--if it was, the
pages should be rejected as they are read from disk, before btrfs even
looks at them, and the insane transid should be the "found" one not the
"expected" one.  That suggests there is either RAM corruption happening
_after_ the data is read from disk (i.e. while the pages are cached in
RAM), or a severe software bug in the kernel you're running.


When doing the btrfs check I also always do a btrfs scrub and it never 
found any error. Once it didn't manage to finish the scrub because of:
BTRFS critical (device dm-0): corrupt leaf, slot offset bad: 
block=670597120,root=1, slot=6

and btrfs scrub status reported "was aborted after 00:00:10".

Talking about scrub, I created a systemd timer to run scrub hourly and I 
noticed 2 *uncorrectable* errors suddenly appeared on my system. So I 
immediately re-ran the scrub just to confirm it, and then I rebooted into 
the Arch live usb and ran btrfs check: the metadata were perfect. So I 
ran btrfs scrub from the live usb and there were no errors at all! I 
rebooted into my system and ran scrub once again and the uncorrectable 
errors were really gone! It happened two times in the past few days.
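An hourly scrub timer like the one mentioned above can be sketched as a systemd service/timer pair. This is a hedged example, not the poster's actual units: the unit names, mount point, and paths are hypothetical; `-B` keeps the scrub in the foreground so the oneshot service only succeeds when the scrub finishes.

```ini
# /etc/systemd/system/btrfs-scrub.service (hypothetical name)
[Unit]
Description=btrfs scrub of /

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs scrub start -B /
```

```ini
# /etc/systemd/system/btrfs-scrub.timer (hypothetical name)
[Unit]
Description=Hourly btrfs scrub

[Timer]
OnCalendar=hourly
Persistent=true

[Install]
WantedBy=timers.target
```

Enabled with something like `systemctl enable --now btrfs-scrub.timer`.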



Try different kernel versions (e.g. 4.4.9 or 4.1.23) in case whoever
maintains your kernel had a bad day and merged a patch they should
not have.


Almost no patches get applied by the Arch kernel team: 
https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/linux
At the moment the only one is a harmless 
"change-default-console-loglevel.patch".



Try a minimal configuration with as few drivers as possible loaded,
especially GPU drivers and anything from the staging subdirectory--when
these drivers have bugs, they ruin everything.


The Arch kernel team is quite conservative regarding staging/experimental 
features; I remember they rejected some config patches I submitted because 
of this.
Anyway, I will try to blacklist as many kernel modules as I can. Maybe 
blacklisting the GPU driver is too much, because if I can't actually use 
my laptop it will be much more difficult to reproduce the issue.



Try memtest86+ which has a few more/different tests than memtest86.
I have encountered RAM modules that pass memtest86 but fail memtest86+
and vice versa.

Try memtester, a memory tester that runs as a Linux process, so it can
detect corruption caused when device drivers spray data randomly into RAM,
or when the CPU thermal controls are influenced by Linux (an overheating
CPU-to-RAM bridge can really ruin your day, and some of the dumber laptop
designs rely on the OS for thermal management).

Try running more than one memory testing process, in case there is a bug
in your hardware that affects interactions between multiple cores (memtest
is single-threaded).  You can run memtest86 inside a kvm (e.g. kvm
-m 3072 -kernel /boot/memtest86.bin) to detect these kinds of issues.

Kernel compiles are a bad way to test RAM.  I've successfully built
kernels on hosts with known RAM failures.  The kernels don't always work
properly, but it's quite rare to see a build fail outright.


I didn't use memtest86+ because of the lack of EFI support, but I just 
tried the shiny new memtest86 7.0 beta with its improved tests for 12+ 
hours without issues.
I also ran "memtester 4G" and "systester-cli -gausslg 64M -threads 4 
-turns 10" together for 12 hours without any issue, so I think both my 
RAM and CPU are ok.


I can think only about two possible culprits now (correct me if I'm wrong):
1) A btrfs bug
2) Another module screwing things around

I can do nothing about btrfs bugs so I will try to hunt the second option. 
This is the list of modules I'm running:


lsmod | awk '$4 == ""' | awk '{print $1}' | sort

8250_dw
ac
acpi_als
acpi_pad
aesni_intel
ahci
algif_skcipher
ansi_cprng
arc4
atkbd
battery
bnep
btrfs
btusb
cdc_ether
cmac
coretemp
crc32c_intel
crc32_pclmul
crct10dif_pclmul
dell_laptop
dell_wmi
dm_crypt
drbg
ecb
elan_i2c
evdev
ext4
fan
fjes
ghash_clmulni_intel
gpio_lynxpoint
hid_generic
hid_multitouch
hmac
i2c_designware_platform
i2c_hid
i2c_i801
i915
input_leds
int3400_thermal
int3402_thermal
int3403_thermal
intel_hid
intel_pch_thermal
intel_powerclamp
intel_rapl
ip_tables

Amount of scrubbed data goes from 15.90GiB to 26.66GiB after defragment -r -v -clzo on a fs always mounted with compress=lzo

2016-05-11 Thread Niccolò Belli

Hi,
Before doing the daily backup I did a btrfs check and btrfs scrub as usual. 
This time I also decided to run btrfs filesystem defragment -r 
-v -clzo on all subvolumes (from a live distro), and just to be sure I 
ran check and scrub once again.


Before defragment: total bytes scrubbed: 15.90GiB with 0 errors
After defragment: total bytes scrubbed: 26.66GiB with 0 errors

What happened? This is something like a night and day difference: almost 
double the data! As stated in the subject, all the subvolumes have always 
been mounted with compress=lzo in /etc/fstab; even when I installed the 
distro a couple of days ago I manually mounted the subvolumes with -o 
compress=lzo. I have never used autodefrag, though.


Niccolò


Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-09 Thread Niccolò Belli

On Sunday 8 May 2016 20:27:55 CEST, Patrik Lundquist wrote:

Are you using any power management tweaks?


Yes, as stated in my very first post I use TLP with 
SATA_LINKPWR_ON_BAT=max_performance, but I managed to reproduce the bug 
even without TLP. Also, in the past week I've always been on AC.


On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote:
Memtest doesn't replicate typical usage patterns very well.  My 
usual testing for RAM involves not just memtest, but also 
booting into a LiveCD (usually SystemRescueCD), pulling down a 
copy of the kernel source, and then running as many concurrent 
kernel builds as cores, each with as many make jobs as cores (so 
if you've got a quad core CPU (or a dual core with 
hyperthreading), it would be running 4 builds with -j4 passed to 
make).  GCC seems to have memory usage patterns that reliably 
trigger memory errors that aren't caught by memtest, so this 
generally gives good results.


Building the kernel with 4 concurrent threads is not an issue for my 
system; in fact I compile a lot and I have never had any issue.


On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote:
On a similar note, badblocks doesn't replicate filesystem like 
access patterns, it just runs sequentially through the entire 
disk.  This isn't as likely to give bad results, but it's still 
important to know.  In particular, try running it over a dmcrypt 
volume a couple of times (preferably with a different key each 
time, pulling keys from /dev/urandom works well for this), as 
that will result in writing different data.  For what it's 
worth, when I'm doing initial testing of new disks, I always use 
ddrescue to copy /dev/zero over the whole disk, then do it twice 
through dmcrypt with different keys, copying from the disk to 
/dev/null after each pass.  This gives random data on disk as a 
starting point (which is good if you're going to use dmcrypt), 
and usually triggers reallocation of any bad sectors as early as 
possible.


While trying to find a common denominator for my issue I did lots of 
backups of /dev/mapper/cryptroot and I restored them into 
/dev/mapper/cryptroot dozens of times (triggering a 150GB+ random data 
write every time), without any issue (after restoring the backup I always 
check the partition with btrfs check). So the disk doesn't seem to be the 
culprit.


On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote:
1. If you have an eSATA port, try plugging your hard disk in 
there and see if things work.  If that works but having the hard 
drive plugged in internally doesn't, then the issue is probably 
either that specific SATA port (in which case your chip-set is 
bad and you should get a new system), or the SATA connector 
itself (or the wiring, but that's not as likely when it's traces 
on a PCB).  Normally I'd suggest just swapping cables and SATA 
ports, but that's not really possible with a laptop.
2. If you have access to a reasonably large flash drive, or to 
a USB to SATA adapter, try that as well, if it works on that but 
not internally (or on an eSATA port), you've probably got a bad 
SATA controller, and should get a new system.


My laptop doesn't have an eSATA port and my only big enough external drive 
is currently used for daily backups, since I fear for data loss.


On Monday 9 May 2016 13:52:16 CEST, Austin S. Hemmelgarn wrote:
3. Try things without dmcrypt.  Adding extra layers makes it 
harder to determine what is actually wrong.  If it works without 
dmcrypt, try using different parameters for the encryption 
(different ciphers is what I would try first).  If it works 
reliably without dmcrypt, then it's either a bug in dmcrypt 
(which I don't think is very likely), or it's bad interaction 
between dmcrypt and BTRFS.  If it works with some encryption 
parameters but not others, then that will help narrow down where 
the issue is.


On Sunday 8 May 2016 01:35:16 CEST, Chris Murphy wrote:

You're making the troubleshooting unnecessarily difficult by
continuing to use non-default options. *shrug*

Every single layer you add complicates the setup and troubleshooting.
Of course all of it should work together, many people do. But you're
the one having the problem so in order to demonstrate whether this is
a software bug or hardware problem, you need to test it with the most
basic setup possible --> btrfs on plain partitions and default mount
options.


I will try to recap, because you obviously missed my previous e-mail: I 
managed to replicate the irrecoverable corruption bug even with default 
options and no dmcrypt at all. Somehow it was a bit more difficult to 
replicate with default options, so I started to play with different 
combinations to find out whether there was something which increased the 
chances of getting corruption. I have the feeling that "autodefrag" 
increases the chances of corruption, but I'm not 100% sure about it. 
Anyway, triggering a whole-package reinstall with "pacaur -S $(pacman 

Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-07 Thread Niccolò Belli

On 2016-05-07 17:58, Clemens Eisserer wrote:

Hi Niccolo,


btrfs + dmcrypt + compress=lzo + autodefrag = corruption at first boot


Just to be curious - couldn't it be a hardware issue? I use almost the
same setup (compress-force=lzo instead of compress=lzo) on my
laptop for 2-3 years and haven't experienced any issues since
~kernel-3.14 or so.

Br, Clemens Eisserer


Hi,
Which kind of hardware issue? I did a full memtest86 check, a full 
smartmontools extended check and even a badblocks -wsv.
If this is really a hardware issue that we can identify I would be more 
than happy, because Dell would replace my laptop and this nightmare would 
finally be over. I'm open to suggestions.


Niccolò


Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-07 Thread Niccolò Belli

btrfs + dmcrypt + compress=lzo + autodefrag = corruption at first boot
So discard is not the culprit. Will try to remove compress=lzo and 
autodefrag and see if it still happens.


[  748.224346] BTRFS error (device dm-0): memmove bogus src_offset 5431 
move len 4294962894 len 16384

[  748.226206] [ cut here ]
[  748.227831] kernel BUG at fs/btrfs/extent_io.c:5723!
[  748.229498] invalid opcode:  [#1] PREEMPT SMP
[  748.231161] Modules linked in: ext4 mbcache jbd2 nls_iso8859_1 
nls_cp437 vfat fat snd_hda_codec_hdmi dell_laptop dcdbas dell_wmi 
iTCO_wdt iTCO_vendor_support intel_rapl x86_pkg_temp_thermal 
intel_powerclamp coretemp kvm_intel arc4 kvm irqbypass psmouse serio_raw 
pcspkr elan_i2c snd_soc_ssm4567 snd_soc_rt286 snd_soc_rl6347a 
snd_soc_core i2c_hid iwlmvm snd_compress snd_pcm_dmaengine ac97_bus 
mac80211 uvcvideo videobuf2_vmalloc btusb videobuf2_memops cdc_ether 
btrtl usbnet iwlwifi btbcm videobuf2_v4l2 btintel intel_pch_thermal 
videobuf2_core i2c_i801 videodev r8152 rtsx_pci_ms cfg80211 bluetooth 
visor media mii memstick joydev evdev mousedev input_leds rfkill mac_hid 
crc16 i915 fan thermal wmi dw_dmac int3403_thermal video dw_dmac_core 
drm_kms_helper snd_soc_sst_acpi i2c_designware_platform 
snd_soc_sst_match
[  748.237203]  snd_hda_intel 8250_dw i2c_designware_core gpio_lynxpoint 
spi_pxa2xx_platform drm int3402_thermal snd_hda_codec battery tpm_crb 
intel_hid snd_hda_core sparse_keymap fjes snd_hwdep int3400_thermal 
acpi_thermal_rel tpm_tis snd_pcm intel_gtt tpm acpi_als syscopyarea 
sysfillrect snd_timer sysimgblt fb_sys_fops mei_me i2c_algo_bit 
processor_thermal_device kfifo_buf processor snd industrialio acpi_pad 
ac int340x_thermal_zone mei intel_soc_dts_iosf button lpc_ich soundcore 
shpchp sch_fq_codel ip_tables x_tables btrfs xor raid6_pq 
jitterentropy_rng sha256_ssse3 sha256_generic hmac drbg ansi_cprng 
algif_skcipher af_alg uas usb_storage dm_crypt dm_mod sd_mod 
rtsx_pci_sdmmc atkbd libps2 crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper
[  748.244176]  ablk_helper cryptd ahci libahci libata scsi_mod xhci_pci 
rtsx_pci xhci_hcd i8042 serio sdhci_acpi sdhci led_class mmc_core pl2303 
mos7720 usbserial parport hid_generic usbhid hid usbcore usb_common

[  748.246662] CPU: 0 PID: 2316 Comm: pacman Not tainted 4.5.1-1-ARCH #1
[  748.249123] Hardware name: Dell Inc. XPS 13 9343/0F5KF3, BIOS A07 
11/11/2015
[  748.251576] task: 8800d9d98e40 ti: 8800cec1 task.ti: 
8800cec1
[  748.254064] RIP: 0010:[]  [] 
memmove_extent_buffer+0x10c/0x110 [btrfs]

[  748.256600] RSP: 0018:8800cec13c18  EFLAGS: 00010246
[  748.259120] RAX:  RBX: 88020c01ba40 RCX: 
0056
[  748.261631] RDX:  RSI: 88021e40db38 RDI: 
88021e40db38
[  748.264166] RBP: 8800cec13c48 R08:  R09: 
033b
[  748.266716] R10:  R11: 033b R12: 
eece
[  748.269267] R13: 00010405 R14: 000104c9 R15: 
88020c01ba40
[  748.271818] FS:  7f14d4271740() GS:88021e40() 
knlGS:

[  748.274392] CS:  0010 DS:  ES:  CR0: 80050033
[  748.276987] CR2: 01630008 CR3: cffc8000 CR4: 
003406f0
[  748.279603] DR0:  DR1:  DR2: 

[  748.282220] DR3:  DR6: fffe0ff0 DR7: 
0400

[  748.284815] Stack:
[  748.287422]  e3438cd2 88020c01ba40 00c4 
002a
[  748.290082]  006b 03a0 8800cec13ce8 
a02b612c
[  748.292754]  a02b433d 8800da9ca820 0028 
8800daa78bd0

[  748.295441] Call Trace:
[  748.298104]  [] btrfs_del_items+0x33c/0x4a0 [btrfs]
[  748.300827]  [] ? btrfs_search_slot+0x90d/0x990 
[btrfs]
[  748.303564]  [] ? btrfs_get_token_8+0x6c/0x130 
[btrfs]
[  748.306311]  [] 
btrfs_truncate_inode_items+0x649/0xd20 [btrfs]
[  748.309071]  [] ? 
btrfs_delayed_inode_release_metadata.isra.1+0x4e/0xf0 [btrfs]
[  748.311860]  [] btrfs_evict_inode+0x485/0x5d0 
[btrfs]

[  748.314627]  [] evict+0xc5/0x190
[  748.317412]  [] iput+0x1d9/0x260
[  748.320199]  [] do_unlinkat+0x199/0x2d0
[  748.322988]  [] SyS_unlink+0x16/0x20
[  748.325781]  [] entry_SYSCALL_64_fastpath+0x12/0x6d
[  748.328584] Code: 41 5e 41 5f 5d c3 48 8b 7f 18 48 89 f2 48 c7 c6 40 
44 36 a0 e8 06 90 fa ff 0f 0b 48 8b 7f 18 48 c7 c6 08 44 36 a0 e8 f4 8f 
fa ff <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 89 fb
[  748.331558] RIP  [] 
memmove_extent_buffer+0x10c/0x110 [btrfs]

[  748.334473]  RSP 
[  748.356077] ---[ end trace 9bfb28800ab52273 ]---
[  748.359042] note: pacman[2316] exited with preempt_count 2

Re: How to rollback a snapshot of a subvolume with nested subvolumes?

2016-05-07 Thread Niccolò Belli

Thanks,
I hoped there was something like a hidden recursive flag to avoid the 
tedious task of snapshotting all the nested subvolumes or deleting the 
nested ones, but it seems there isn't.
I usually don't want to keep things like /var/cache/pacman/pkg, but since 
I'm just doing some tests I didn't want to lose my package cache.
Regarding @/.snapshots, it was an unfortunate choice made by snapper and I 
will definitely create it in the top level, like /srv, which shouldn't 
belong to @.


By the way, snapper rollbacks are yet another reason not to keep subvolid 
along with subvol=@ in fstab, as in the one automatically generated by 
genfstab.


Niccolò


How to rollback a snapshot of a subvolume with nested subvolumes?

2016-05-06 Thread Niccolò Belli

The following are my subvolumes:

$ sudo btrfs subvol list /
ID 257 gen 1040 top level 5 path @

ID 258 gen 1040 top level 5 path @home
ID 270 gen 889 top level 257 path var/cache/pacman/pkg
ID 271 gen 15 top level 257 path var/abs
ID 272 gen 972 top level 257 path var/tmp
ID 273 gen 37 top level 257 path tmp
ID 274 gen 20 top level 257 path srv
ID 276 gen 25 top level 258 path @home/niko/.cache/pacaur
ID 280 gen 993 top level 257 path .snapshots
ID 281 gen 993 top level 258 path @home/.snapshots
ID 282 gen 169 top level 280 path .snapshots/1/snapshot
ID 283 gen 171 top level 280 path .snapshots/2/snapshot
ID 284 gen 173 top level 280 path .snapshots/3/snapshot
ID 285 gen 124 top level 281 path @home/.snapshots/1/snapshot
ID 286 gen 175 top level 280 path .snapshots/4/snapshot
ID 288 gen 177 top level 280 path .snapshots/5/snapshot
ID 290 gen 237 top level 280 path .snapshots/6/snapshot
ID 291 gen 238 top level 281 path @home/.snapshots/2/snapshot
ID 292 gen 308 top level 280 path .snapshots/7/snapshot
ID 293 gen 309 top level 281 path @home/.snapshots/3/snapshot
ID 294 gen 376 top level 280 path .snapshots/8/snapshot
ID 295 gen 377 top level 281 path @home/.snapshots/4/snapshot
ID 296 gen 442 top level 280 path .snapshots/9/snapshot
ID 297 gen 443 top level 281 path @home/.snapshots/5/snapshot
ID 298 gen 511 top level 280 path .snapshots/10/snapshot
ID 299 gen 512 top level 281 path @home/.snapshots/6/snapshot
ID 300 gen 578 top level 280 path .snapshots/11/snapshot
ID 301 gen 579 top level 281 path @home/.snapshots/7/snapshot
ID 302 gen 648 top level 280 path .snapshots/12/snapshot
ID 303 gen 649 top level 281 path @home/.snapshots/8/snapshot
ID 304 gen 716 top level 280 path .snapshots/13/snapshot
ID 305 gen 717 top level 281 path @home/.snapshots/9/snapshot
ID 306 gen 967 top level 280 path .snapshots/14/snapshot
ID 307 gen 789 top level 281 path @home/.snapshots/10/snapshot
ID 309 gen 834 top level 280 path .snapshots/15/snapshot
ID 310 gen 874 top level 280 path .snapshots/16/snapshot
ID 311 gen 875 top level 281 path @home/.snapshots/11/snapshot
ID 312 gen 887 top level 280 path .snapshots/17/snapshot
ID 313 gen 888 top level 280 path .snapshots/18/snapshot
ID 314 gen 904 top level 280 path .snapshots/19/snapshot
ID 316 gen 938 top level 280 path .snapshots/20/snapshot
ID 317 gen 939 top level 281 path @home/.snapshots/12/snapshot
ID 318 gen 991 top level 280 path .snapshots/21/snapshot
ID 319 gen 992 top level 281 path @home/.snapshots/13/snapshot


I would like to roll back to .snapshots/14/snapshot, restoring my @ 
subvolume to that previous state.

So I booted into a livecd, mounted my disk into /mnt and typed:
btrfs subvol snapshot /mnt/@/.snapshots/14/snapshot /mnt/@
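For reference, a dry-run sketch of the more usual rollback sequence; note that if /mnt/@ still exists as a directory, `btrfs subvolume snapshot` creates the new snapshot *inside* it rather than replacing it, so the old @ is normally moved aside first. The device name and exact paths below are assumptions taken from the subvolume list above, and the run() wrapper only prints each step instead of executing it:

```shell
# Dry-run sketch of a manual rollback from a live CD.  /dev/sdXn is a
# placeholder for the real device; the layout (top-level @ plus a sibling
# .snapshots subvolume) is assumed from the subvolume listing above.
run() { echo "+ $*"; }

run mount -o subvolid=5 /dev/sdXn /mnt      # mount the btrfs top level
run mv /mnt/@ /mnt/@.broken                 # move the damaged rootfs aside
run btrfs subvolume snapshot /mnt/.snapshots/14/snapshot /mnt/@
run umount /mnt                             # then reboot into the restored @
```

Drop the run() wrapper to execute the steps for real.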


Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-06 Thread Niccolò Belli
I formatted the partition and copied the content of my previous rootfs to 
it. There is no dmcrypt now and the mount options are the defaults, except 
for noatime. After a single boot I got the very same problem as before (fs 
corrupted and an infinite loop when doing btrfs check --repair).


I wanted to replicate the results, so I tried once again, and since then I 
have only experienced minor corruption, correctly resolved by repair. But 
during a pacman upgrade, which triggered snapper pre/post snapshots, the 
system hung and I found this in the logs:


mag 06 10:31:15 arch-laptop plasmashell[873]: requesting unexisting screen 
2
mag 06 10:31:18 arch-laptop dbus[418]: [system] Activating service 
name='org.opensuse.Snapper' (using servicehelper)
mag 06 10:31:18 arch-laptop dbus[418]: [system] Successfully activated 
service 'org.opensuse.Snapper'

mag 06 10:31:20 arch-laptop kernel: [ cut here ]
mag 06 10:31:20 arch-laptop kernel: kernel BUG at fs/btrfs/ctree.h:2693!

Still no major corruption found since my second attempt.

Niccolò


Re: btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-05 Thread Niccolò Belli

On Thursday 5 May 2016 03:07:37 CEST, Chris Murphy wrote:

I suggest using defaults for starters. The only thing in that list
that needs be there is either subvolid or subvold, not both. Add in
the non-default options once you've proven the defaults are working,
and add them one at a time.


Yes, I read your previous suggestion and I had already dropped subvolid, but 
since the problem had already happened I left it in the mail for completeness.
Anyway the culprit here is genfstab and that's probably what a beginner is 
going to use when installing a distro: 
https://wiki.archlinux.org/index.php/beginners'_guide#fstab



Disk is a SAMSUNG SSD PM851 M.2 2280 256GB (Firmware Version: EXT25D0Q).


The firmware is old if I understand the naming scheme used by Dell. It
says EXT49D0Q is current.

http://www.dell.com/support/home/al/en/aldhs1/Drivers/DriversDetails?driverId=0NXHH


According to this 
(http://forum.notebookreview.com/threads/2015-xps-13-ssd-fw-problem-with-m-2-samsung-pm851.770501/) 
the firmware you linked is for the mSATA version of the drive, not the M.2 
one. EXT25D0Q seems to be the very latest one for my drive.



I advice using all defaults for everything for
now, otherwise it's anyone's guess what you're running into.


On Thursday 5 May 2016 06:12:28 CEST, Qu Wenruo wrote:
Would it be OK for you to test your btrfs on a plain ssd, 
without encryption?
And just as Chris Murphy said, reducing mount option is also a 
pretty good debugging start point.


Ok, I will remove dmcrypt, discard, compress=lzo, nodefrag and see what 
happens.



I made a copy of /dev/mapper/cryptroot with dd on an external drive and
I run btrfs check on it (btrfs-progs 4.5.2):
https://drive.google.com/open?id=0Bwe9Wtc-5xF1SjJacXpMMU5mems (37MB)


Checked, but seems the output is truncated?


No, I didn't truncate the btrfs check output because it wasn't endless. I 
just truncated the repair output.


I also have something new to report. Do you remember when I said that my 
screen was black and I had to forcibly power off the system? Something 
similar happened today, and because I had enabled the magic sysrq keys in 
the meantime, I was able to recover this from the logs:


mag 05 11:55:51 arch-laptop kdeinit5[960]: Registering 
"org.kde.StatusNotifierItem-1060-1/StatusNotifierItem" to system tray

mag 05 11:55:51 arch-laptop obexd[1098]: OBEX daemon 5.39
mag 05 11:55:51 arch-laptop dbus-daemon[920]: Successfully activated 
service 'org.bluez.obex'

mag 05 11:55:51 arch-laptop systemd[898]: Started Bluetooth OBEX service.
mag 05 11:55:51 arch-laptop korgac[1044]: log_kidentitymanagement: 
IdentityManager: There was no default identity. Marking first one as 
default.
mag 05 11:55:51 arch-laptop kernel: BUG: unable to handle kernel paging 
request at 00017d11
mag 05 11:55:51 arch-laptop kernel: IP: [] 
anon_vma_interval_tree_insert+0x3f/0x90
mag 05 11:55:51 arch-laptop kernel: PGD 0 
mag 05 11:55:51 arch-laptop kernel: Oops:  [#1] PREEMPT SMP 
mag 05 11:55:51 arch-laptop kernel: Modules linked in: rfcomm(+) visor bnep 
uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core 
videodev media btusb btrtl btbcm btintel cdc_ether bluetooth usbnet r8152 
crc16 mii joydev mousedev nvr
mag 05 11:55:51 arch-laptop kernel:  mei_me syscopyarea sysfillrect snd 
sysimgblt fb_sys_fops i2c_algo_bit shpchp soundcore mei wmi thermal fan 
intel_hid sparse_keymap int3403_thermal video processor_thermal_device 
dw_dmac snd_soc_sst_acpi snd_soc_sst_m
mag 05 11:55:51 arch-laptop kernel:  lrw gf128mul glue_helper ablk_helper 
cryptd ahci libahci libata scsi_mod xhci_pci rtsx_pci

mag 05 11:55:51 arch-laptop kernel: Bluetooth: RFCOMM TTY layer initialized
mag 05 11:55:51 arch-laptop kernel: Bluetooth: RFCOMM socket layer 
initialized

mag 05 11:55:51 arch-laptop kernel: Bluetooth: RFCOMM ver 1.11
mag 05 11:55:51 arch-laptop kernel:  xhci_hcd
mag 05 11:55:51 arch-laptop kernel:  i8042 serio sdhci_acpi sdhci led_class 
mmc_core pl2303 mos7720 usbserial parport hid_generic usbhid hid usbcore 
usb_common
mag 05 11:55:51 arch-laptop kernel: CPU: 0 PID: 351 Comm: systemd-udevd Not 
tainted 4.5.1-1-ARCH #1
mag 05 11:55:51 arch-laptop kernel: Hardware name: Dell Inc. XPS 13 
9343/0F5KF3, BIOS A07 11/11/2015
mag 05 11:55:51 arch-laptop kernel: task: 88021347d580 ti: 
880211f8c000 task.ti: 880211f8c000
mag 05 11:55:51 arch-laptop kernel: RIP: 0010:[]  
[] anon_vma_interval_tree_insert+0x3f/0x90
mag 05 11:55:51 arch-laptop kernel: RSP: 0018:880211f8fd68  EFLAGS: 
00010206
mag 05 11:55:51 arch-laptop kernel: RAX: 8800da2f4820 RBX: 
8800bb59ce40 RCX: 8800da2f4830
mag 05 11:55:51 arch-laptop kernel: RDX: 8800da2f4828 RSI: 
8800374404a0 RDI: 8800c58dfa40
mag 05 11:55:51 arch-laptop kernel: RBP: 880211f8fdb8 R08: 
00017c79 R09: 0007f55e2059
mag 05 11:55:51 arch-laptop kernel: R10: 0007f55e2053 R11: 
8800c58dfa40 R12: 880037440460
mag 05 11:55:51 arch-laptop kernel: 

btrfs ate my data in just two days, after a fresh install. ram and disk are ok. it still mounts, but I cannot repair

2016-05-04 Thread Niccolò Belli
I really need your help, because it's the second time btrfs ate my data in 
a couple of days and I can't use my laptop if I don't find the culprit.


This was the mail I sent a couple of days ago: 
https://www.spinics.net/lists/linux-btrfs/msg54754.html
I previously thought the culprit was a bug in kernel 4.6-rc, but I was 
wrong.


Then I reinstalled the whole system (Arch Linux) from scratch, and after 
just two days I lost some of my data, again. Once again btrfs check 
--repair got stuck in an infinite loop and I can't repair my fs. The system 
has always been shut down properly, except for a single time when I had to 
forcibly power it off just after boot because I didn't see any signal 
on the screen.


First the obvious things:

- memory is ok 
(https://drive.google.com/open?id=0Bwe9Wtc-5xF1VnJ0SE9fT1FZMTg)
- disk is ok 
(https://drive.google.com/open?id=0Bwe9Wtc-5xF1NGRhd2daVDRJVGc)
- tlp has SATA_LINKPWR_ON_BAT=max_performance 
(https://drive.google.com/open?id=0Bwe9Wtc-5xF1dFAwUE5ETVpNWGM)
- rootfs mount options: 
rw,noatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@
- Command line: BOOT_IMAGE=/@/boot/vmlinuz-linux 
root=UUID=4fc2278e-f6e8-4a21-8876-cabbf885bb2e rw rootflags=subvol=@ 
cryptdevice=/dev/disk/by-uuid/c7c8f501-507c-4bd2-a80a-8c7360651f02:cryptroot:allow-discards 
quiet

- scrub didn't find any error:
$ sudo btrfs scrub status /
scrub status for 4fc2278e-f6e8-4a21-8876-cabbf885bb2e
    scrub started at Thu May  5 00:57:30 2016 and finished after 00:00:45
    total bytes scrubbed: 22.26GiB with 0 errors

I have the whole rootfs encrypted, including boot. I followed these steps: 
https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_an_entire_system#Btrfs_subvolumes_with_swap


Disk is a SAMSUNG SSD PM851 M.2 2280 256GB (Firmware Version: EXT25D0Q).
Laptop is a Dell XPS 13 9343 QHD+.
Distro is Arch Linux, kernel version is 4.5.1. btrfs-progs is 4.5.2.

After two days from the previous data loss I finished reinstalling my 
distro from scratch, then I decided to do a full backup from a snapshot 
using tar. This is what I got while trying to back up my data:


tar: usr/share/kig/icons/hicolor/32x32/actions/test.png: Read error at byte 0, while reading 810 bytes: Input/output error
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebpd.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/pointOnLine.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/bezierN.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/convexhull.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/centerofcurvature.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/en.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebps.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/directrix.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/beziercurves.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/segment_midpoint.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/distance.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/circlebcl.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/conicb5p.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/kig_polygon.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/conicasymptotes.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/pointxy.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/attacher.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/coniclineintersection.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/vectorsum.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/rbezier4.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/ellipsebffp.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/angle.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/kig_text.png: Cannot stat: Stale file handle
tar: usr/share/kig/icons/hicolor/32x32/actions/vectordifference.png: Cannot stat: Stale file handle
tar: 

Re: /etc/fstab rootfs options vs grub2 rootflags cmdline

2016-05-04 Thread Niccolò Belli

Thanks,
Now my fstab options are 
rw,noatime,compress=lzo,discard,autodefrag,subvolid=257,subvol=/@
I tried to add rootflags=noatime,compress=lzo,discard,autodefrag to 
GRUB_CMDLINE_LINUX in /etc/default/grub as you suggested but my system 
didn't manage to boot, probably because grub automatically adds 
rootflags=subvol=@ and only a single rootflags can be taken into account. 
Do you have any suggestion?
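One hedged workaround, given that only a single rootflags= on the command line is taken into account: fold everything, including subvol=@, into one rootflags in /etc/default/grub, so whichever copy wins is still complete. The option list is taken from the fstab line above; this is a sketch, not a tested recipe:

```shell
# /etc/default/grub -- sketch: put subvol=@ and the desired rootfs options
# into a single rootflags= instance.
GRUB_CMDLINE_LINUX="rootflags=subvol=@,noatime,compress=lzo,discard,autodefrag"
```

Then regenerate with grub-mkconfig -o /boot/grub/grub.cfg, and check the resulting linux line: if grub-mkconfig still emits its own rootflags=subvol=@ as well, make sure the complete one is the later of the two.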


Niccolò
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


/etc/fstab rootfs options vs grub2 rootflags cmdline

2016-05-04 Thread Niccolò Belli

Hi,
I have the following options for my rootfs in /etc/fstab:
rw,relatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@

grub2 already placed rootflags=subvol=@ in its cmdline, but not the other 
options. I suppose that some of them will automatically be set during 
remount, but I'm not sure if all of them will.


Do you know which ones I should manually add to GRUB_CMDLINE_LINUX in 
/etc/default/grub?


Is there any way to check if they are already enabled?
mount shows /dev/mapper/cryptroot on / type btrfs 
(rw,relatime,compress=lzo,ssd,discard,space_cache,autodefrag,subvolid=257,subvol=/@) 
but I'm not sure if I can trust it: I read that space_cache should trigger 
"enabling free space tree" in dmesg but I can't see it and I don't know 
about the others.
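The authoritative place to look is /proc/self/mounts, which is generated by the kernel and reflects the options it actually applied, rather than what fstab requested. A minimal check:

```shell
# Print the options the kernel actually applied to the rootfs.  An option
# missing from this list really is not in effect, whatever fstab says.
awk '$2 == "/" { print $4 }' /proc/self/mounts
```

Note that some btrfs options are reported in dmesg rather than echoed back here; space_cache, as mentioned above, is one of them.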


Thanks,
Niccolò


Re: Uncorrectable errors after rebooting with Magic Sysrq Keys

2016-04-30 Thread Niccolò Belli
Finally my external drive arrived and I've been able to make a backup and 
try btrfs check --repair.
Unfortunately btrfs check --repair got stuck in an infinite loop like this 
one (https://www.spinics.net/lists/linux-btrfs/msg54146.html) and after 
several hours of looping and several Gigabytes of logs I had to kill it, 
which gave me a completely fucked fs.
I still have backup images, so I can restore the old state and try again 
with updated tools (I used latest btrfs-progs 4.5.1, but I also tried 
4.4.1).
For those who didn't read the whole thread I can mount the fs, but it hangs 
while trying to read certain files and sometimes it remounts read-only. I'm 
pretty sure the culprit was a bug in 4.6-rc because problems started 
roughly after upgrading. Disk (an SSD) is fine. The fs is on top of 
dm-crypt and I always mounted it with 
"rw,relatime,ssd,space_cache,discard,compress=lzo,autodefrag".


You can find the whole logs here: 
https://drive.google.com/open?id=0Bwe9Wtc-5xF1Z2YwN1Y4U0ROSUU


01_scrub is the scrub output
02_check is the btrfs check output (14MB)
03_repair_short is the btrfs check --repair output truncated to 14MB

I hope someone will be able to help me recover my data, otherwise I will 
have to backup just the most important files and reinstall the whole system 
from scratch. Mounting the fs and doing a backup with cp -a wasn't a viable 
solution because it got stuck after several GBs.
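When copying through a mounted, damaged fs keeps hanging, one alternative worth knowing is btrfs restore, which reads files from the unmounted device directly and so does not hang on a wedged kernel mount. A dry-run sketch with placeholder paths; the run() wrapper only prints the command:

```shell
# Dry-run sketch: pull files off a damaged filesystem without mounting it.
# /dev/mapper/cryptroot and /mnt/backup are placeholders for the source
# device and a destination directory on a healthy disk.
run() { echo "+ $*"; }

run btrfs restore -v /dev/mapper/cryptroot /mnt/backup
# If the default tree root is unreadable, -t <bytenr> with an older root
# found by 'btrfs-find-root' may still get data out.
```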


Niccolò

P.S.
I changed my spf/dkim/dmarc settings, this email should no longer go into 
the spam folder, if it does please let me know. Thanks.



Re: Uncorrectable errors after rebooting with Magic Sysrq Keys

2016-04-16 Thread Niccolò Belli
I finally ran a btrfs check --readonly on my fs; sorry it took so long, 
but it complained about the fs being mounted even though it was mounted 
read-only, so I had to download a Fedora 24 alpha livecd to be able to run it.
Here it is (8.5MB): 
https://drive.google.com/open?id=0Bwe9Wtc-5xF1blJGMTNHaDdUQjg


In the meantime, since I suspected it might be a 4.6 regression, I switched 
back to 4.5.


P.S.
Scrub's uncorrectable errors went down from 10 to 4 by themselves, without 
any apparent reason.


Niccolò


Re: Uncorrectable errors after rebooting with Magic Sysrq Keys

2016-04-15 Thread Niccolò Belli

https://bpaste.net/show/df9cc097c1da

This fs is *completely* FUCKED. Can't wait to get my hands on the external 
drive to be able to make a full backup.
Is it possible it is a kernel 4.6 regression? I had problems before, but 
nothing like this :(


Niccolò


Re: Uncorrectable errors after rebooting with Magic Sysrq Keys

2016-04-15 Thread Niccolò Belli

Hi,
Is it 100% safe to run a btrfs check without --repair?
Because otherwise I will have to wait for my new external drive to arrive 
and make a backup first.


Thanks,
Niccolò

On Friday 15 April 2016 11:30:32 CEST, Qu Wenruo wrote:
Would you please run "btrfs check --readonly " and 
paste the output?


The dmesg seems very impossible:


BTRFS error (device dm-0): bad tree block start 245497856 245498111


The latter one is not even aligned to 2.

But your system still seems mountable as you succeeded in 
running btrfs scrub.


So I assume either the tree block is not a critical one or the 
copy saved you.


Thanks,
Qu



Uncorrectable errors after rebooting with Magic Sysrq Keys

2016-04-15 Thread Niccolò Belli

Hi,
Unfortunately, because of buggy upstream support for my hardware (Dell XPS 
13 9343) I often have to force a reboot using the Magic Sysrq keys (REISUB). 
In fact I have quite a few hangs, and the majority of the time I am not able 
to shut down without relying on REISUB. There are obviously times when even 
REISUB does not work (the kernel is completely unresponsive), but the vast 
majority of the time it works. What I do not understand is why the Magic 
Sysrq keys leave me with a damaged filesystem: shouldn't an emergency sync 
plus a read-only remount be enough to secure my data? After rebooting with 
REISUB my system often complains about "read only" files, and if I "stat" 
them I get "weird file". I often lose some of my desktop settings, like the 
plasmoids I had on the desktop or my favourite applications I had in the 
menu, but what's even stranger is that I often magically recover them later, 
while doing exactly NOTHING to recover them. This behaviour scares me so 
much that I'm thinking about switching to another fs if I don't find a 
solution very soon.
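For reference, the S and U in REISUB are exactly that emergency sync and read-only remount, and each step can also be issued by hand through /proc/sysrq-trigger. A dry-run sketch (run() only prints; executing these for real needs root, kernel.sysrq enabled, and B reboots immediately):

```shell
# Dry-run of the tail end of REISUB via /proc/sysrq-trigger.
# run() prints each step instead of executing it.
run() { echo "+ $*"; }

run sh -c 'echo s > /proc/sysrq-trigger'   # S: emergency sync of dirty data
run sh -c 'echo u > /proc/sysrq-trigger'   # U: remount all filesystems read-only
run sh -c 'echo b > /proc/sysrq-trigger'   # B: reboot, with no further sync
```

Even when S and U complete, writes still sitting in the drive's volatile cache can be lost if the firmware mishandles flushes, which is one plausible way a seemingly clean REISUB can still leave a damaged filesystem.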


The disk seems fine: https://bpaste.net/show/822d4b4ff902

dmesg: http://paste.pound-python.org/show/wVyHXXOw4emWmWFfVJHQ/

$ sudo btrfs scrub status /
[sudo] password di niko: 
scrub status for 28443ff1-5325-45f6-b879-dad895fcdcfb
    scrub started at Fri Apr 15 09:38:09 2016 and finished after 00:08:41
    total bytes scrubbed: 133.94GiB with 10 errors
    error details: csum=10
    corrected errors: 0, uncorrectable errors: 10, unverified errors: 0

(yesterday there were 4 uncorrectable errors, but after today's reboot with 
Magic Sysrq Keys it is now 10)


Distro is Arch Linux, kernel is 4.6.0-rc3.
$ btrfs --version
btrfs-progs v4.4.1

Greetings,
Niccolò