Re: [PATCH v2] btrfs-progs: du: fix to skip not btrfs dir/file

2016-07-08 Thread Andrei Borzenkov
07.07.2016 12:43, Wang Shilong wrote:
> 'btrfs file du' is a very useful tool to watch my system
> file usage information with snapshot aware.
> 
> when trying to run following commands:
> [root@localhost btrfs-progs]# btrfs file du /
>  Total   Exclusive  Set shared  Filename
> ERROR: Failed to lookup root id - Inappropriate ioctl for device
> ERROR: cannot check space of '/': Unknown error -1
> 
> and My Filesystem looks like this:
> [root@localhost btrfs-progs]# df -Th
> Filesystem     Type      Size  Used Avail Use% Mounted on
> devtmpfs       devtmpfs   16G     0   16G   0% /dev
> tmpfs          tmpfs      16G  368K   16G   1% /dev/shm
> tmpfs          tmpfs      16G  1.4M   16G   1% /run
> tmpfs          tmpfs      16G     0   16G   0% /sys/fs/cgroup
> /dev/sda3      btrfs      60G   19G   40G  33% /
> tmpfs          tmpfs      16G  332K   16G   1% /tmp
> /dev/sdc       btrfs     2.8T  166G  1.7T   9% /data
> /dev/sda2      xfs       2.0G  452M  1.6G  23% /boot
> /dev/sda1      vfat      1.9G   11M  1.9G   1% /boot/efi
> tmpfs          tmpfs     3.2G   24K  3.2G   1% /run/user/1000
> 
> So I installed Btrfs as my root partition, but boot partition
> can be other fs.
> 
> We can Let btrfs tool aware of this is not a btrfs file or
> directory and skip those files, so that someone like me
> could just run 'btrfs file du /' to scan all btrfs filesystems.
> 
> After patch, it will look like:
>Total   Exclusive  Set shared  Filename
>  0.00B   0.00B   -  //root/.bash_logout
>  0.00B   0.00B   -  //root/.bash_profile
>  0.00B   0.00B   -  //root/.bashrc
>  0.00B   0.00B   -  //root/.cshrc
>  0.00B   0.00B   -  //root/.tcshrc
> 

Can you avoid double slashes?
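
A minimal sketch of one way to avoid the doubled separator, written in
Python for brevity rather than in the C of btrfs-progs. The helper name
join_path is hypothetical and is not taken from the patch:

```python
def join_path(parent, name):
    """Join a directory and an entry name with exactly one '/' between them."""
    if parent.endswith("/"):
        # Covers the root directory "/" and any path given with a trailing slash.
        return parent + name
    return parent + "/" + name


if __name__ == "__main__":
    print(join_path("/", "root/.bashrc"))
    print(join_path("/data", "file"))
```

The same check on the last character of the parent path would work in C
before appending the entry name.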


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid1 has failing disks, but smart is clear

2016-07-08 Thread Andrei Borzenkov
08.07.2016 04:24, Duncan wrote:
> Corey Coughlin posted on Wed, 06 Jul 2016 23:40:30 -0700 as excerpted:
> 
>> Well yeah, if I was mounting all the disks to different mount points, I
>> would definitely use UUIDs to get them mounted.  But I haven't seen any
>> way to set up a "mkfs.btrfs" command to use UUID or anything else for
>> individual drives.  Am I missing something?  I've been doing a lot of
>> googling.
> 
> FWIW, you can use the /dev/disk/by-*/* symlinks (as normally setup by 
> udev) to reference various devices.
> 

Current udev ships a rule that calls the equivalent of "btrfs device
ready $dev", where $dev is the canonical kernel device name. The btrfs
kernel driver updates its internal list of device names when it gets
this ioctl, which means that unless you explicitly pass the full list of
/dev/disk/by-*/* paths during mount, you will see the kernel names.

And even then, as soon as a device for some reason disappears and
reappears (as is apparently the case here), it will be renamed back by
udev when the "add" event is seen.
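
To relate a kernel name shown by btrfs back to its stable aliases, one
option is to resolve the udev symlinks. This is only a sketch: the
function name stable_names is made up for illustration, and it assumes
the usual udev layout where entries under /dev/disk/by-id are symlinks
to the kernel device nodes.

```python
import os


def stable_names(kernel_dev, by_dir):
    """Return the symlinks in by_dir that resolve to kernel_dev."""
    target = os.path.realpath(kernel_dev)
    matches = []
    for entry in sorted(os.listdir(by_dir)):
        link = os.path.join(by_dir, entry)
        # Only symlinks are of interest; compare their resolved targets.
        if os.path.islink(link) and os.path.realpath(link) == target:
            matches.append(link)
    return matches


if __name__ == "__main__":
    by_id = "/dev/disk/by-id"
    if os.path.isdir(by_id):
        print(stable_names("/dev/sda", by_id))
```

Running it again after a device has dropped off and come back would show
which stable alias now points at the new kernel name.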


Re: raid1 has failing disks, but smart is clear

2016-07-08 Thread Andrei Borzenkov
07.07.2016 09:40, Corey Coughlin wrote:
> Hi Tomasz,
> Thanks for the response!  I should clear some things up, though.
> 
> On 07/06/2016 03:59 PM, Tomasz Kusmierz wrote:
>>> On 6 Jul 2016, at 23:14, Corey Coughlin
>>>  wrote:
>>>
>>> Hi all,
>>> Hoping you all can help, have a strange problem, think I know
>>> what's going on, but could use some verification.  I set up a raid1
>>> type btrfs filesystem on an Ubuntu 16.04 system, here's what it looks
>>> like:
>>>
>>> btrfs fi show
>>> Label: none  uuid: 597ee185-36ac-4b68-8961-d4adc13f95d4
>>> Total devices 10 FS bytes used 3.42TiB
>>> devid    1 size 1.82TiB used 1.18TiB path /dev/sdd
>>> devid    2 size 698.64GiB used 47.00GiB path /dev/sdk
>>> devid    3 size 931.51GiB used 280.03GiB path /dev/sdm
>>> devid    4 size 931.51GiB used 280.00GiB path /dev/sdl
>>> devid    5 size 1.82TiB used 1.17TiB path /dev/sdi
>>> devid    6 size 1.82TiB used 823.03GiB path /dev/sdj
>>> devid    7 size 698.64GiB used 47.00GiB path /dev/sdg
>>> devid    8 size 1.82TiB used 1.18TiB path /dev/sda
>>> devid    9 size 1.82TiB used 1.18TiB path /dev/sdb
>>> devid   10 size 1.36TiB used 745.03GiB path /dev/sdh
> Now when I say that the drives mount points change, I'm not saying they
> change when I reboot.  They change while the system is running.  For
> instance, here's the fi show after I ran a "check --repair" run this
> afternoon:
> 
> btrfs fi show
> Label: none  uuid: 597ee185-36ac-4b68-8961-d4adc13f95d4
> Total devices 10 FS bytes used 3.42TiB
> devid    1 size 1.82TiB used 1.18TiB path /dev/sdd
> devid    2 size 698.64GiB used 47.00GiB path /dev/sdk
> devid    3 size 931.51GiB used 280.03GiB path /dev/sdm
> devid    4 size 931.51GiB used 280.00GiB path /dev/sdl
> devid    5 size 1.82TiB used 1.17TiB path /dev/sdi
> devid    6 size 1.82TiB used 823.03GiB path /dev/sds
> devid    7 size 698.64GiB used 47.00GiB path /dev/sdg
> devid    8 size 1.82TiB used 1.18TiB path /dev/sda
> devid    9 size 1.82TiB used 1.18TiB path /dev/sdb
> devid   10 size 1.36TiB used 745.03GiB path /dev/sdh
> 
> Notice that /dev/sdj in the previous run changed to /dev/sds.  There was
> no reboot, the mount just changed.  I don't know why that is happening,
> but it seems like the majority of the errors are on that drive.  But
> given that I've fixed the start/stop issue on that disk, it probably
> isn't a WD Green issue.

These are not "mount points", they are just device names. Do not make it
sound more confusing than it already is :)

A changing name implies that disks drop off and reappear. Do you have
"dmesg" or a log (/var/log/syslog or /var/log/messages or journalctl)
for the same period of time?
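
For spotting such renames without comparing the listings by eye, a small
sketch that diffs two saved "btrfs fi show" outputs by devid. The
function names are invented for this example, and the regular expression
assumes the "devid N size ... used ... path ..." line layout quoted
above.

```python
import re

# Matches lines such as: devid    6 size 1.82TiB used 823.03GiB path /dev/sdj
DEVID_RE = re.compile(r"devid\s*(\d+)\s+size\s+\S+\s+used\s+\S+\s+path\s+(\S+)")


def devid_paths(fi_show_output):
    """Map each devid to the device path reported for it."""
    return {int(m.group(1)): m.group(2) for m in DEVID_RE.finditer(fi_show_output)}


def renamed_devices(before, after):
    """Return {devid: (old_path, new_path)} for devids whose path changed."""
    old = devid_paths(before)
    new = devid_paths(after)
    return {
        devid: (old[devid], new[devid])
        for devid in old
        if devid in new and old[devid] != new[devid]
    }


if __name__ == "__main__":
    first = "devid    6 size 1.82TiB used 823.03GiB path /dev/sdj\n"
    second = "devid    6 size 1.82TiB used 823.03GiB path /dev/sds\n"
    print(renamed_devices(first, second))
```

The timestamps of the matching "add" events in the log would then show
when each rename happened.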



Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-08 Thread Andrei Borzenkov
09.07.2016 00:50, Chris Murphy wrote:
>>
>> Instead those utilities should employ rootflags=subvol or subvolid to
>> explicitly use a particular fs tree for rootfs, rather that hide this
>> fact by using subvolume set-default.
> 
> The only distro installer I know that works this way out of the box is
> Fedora/Red Hat's Anaconda. It leaves the default subvolume as 5, but
> does not install the OS there. Instead each mountpoint is created as a
> subvolume in that top level, and rootflags kernel parameter and fstab
> are used to assemble those subvolumes per the FHS virtually. It's
> completely discoverable, you can follow each step along the way, it's
> not obscured.
> 
> The additional benefit is no nested subvolumes.
> 

Does it use grub2? Where is /boot/grub located: on one of those
snapshots or on a partition outside of btrfs control?

Does it support booting directly from a previous read-only snapshot
and/or rollback to a previous snapshot?


Re: raid1 has failing disks, but smart is clear

2016-07-08 Thread Corey Coughlin

Hi all,
One thing I may not have made clear: this wasn't a system that had
been running for a month and just became corrupt out of nowhere. The
corruption showed up the first time I tried to run a filesystem balance,
basically the day after I set up the filesystem and copied files over.
I was hoping to get it stable and then add some more disks, but since it
wasn't stable right off the top, I'm assuming the problem is bigger than
some bad memory. I ran stress.sh on two disks connected to ports on
the motherboard, and that seemed to work fine.  I'm using a pair of WD
Green drives, in case there's an issue with those.  I did order some WD
Red NAS drives, and I'm hoping they arrive soon.  I'm running the stress
test now with them connected to the SAS card, trying to only let them run
for a day to see if something bad happens. It's a 4 port card, so if
there's a problem with a specific port or cable, it could take me a while
to find it.  I'm hoping it shows up in a somewhat obvious way.  But
thanks for all the help; the stress.sh runs give me a clear way to try
to debug this. Thanks again for that tip.


--- Corey

On 07/08/2016 05:14 AM, Austin S. Hemmelgarn wrote:

On 2016-07-08 07:14, Tomasz Kusmierz wrote:


Well, I was able to run memtest on the system last night, that passed
with flying colors, so I'm now leaning toward the problem being in the
sas card.  But I'll have to run some more tests.



Seriously, use "stress.sh" for a couple of days. When I was running
memtest, it ran continuously for 3 days without an error; after a day
of stress.sh, errors started showing up.
Be VERY careful about trusting any tool of that sort, modern CPUs lie
to you continuously!
1. You may think that you've written the best cache-bypassing code on
the planet, but in reality, since CPUs are multicore, you can end up
with an overzealous MPMD trapping you inside your cache memory. All
your testing will do is write a page (trapped in cache) and read it
back from cache (the coherency mechanism, not the miss/hit one, will
trap you inside L3), so you have no clue that you never touch the RAM;
then the CPU will just dump your page to RAM and "job done".
2. Because of coherency problems and real problems with non-blocking on
MPMD, you can have a DMA controller sucking pages out of your own
cache: since the RAM is marked as dirty, the CPU will try to save time
and accelerate the operation by pushing DMA straight out of L3 to
somewhere else. (I mention this since some testers use a crazy way of
forcing your RAM access via DMA to somewhere and back, to force data
to drop out of L3.)
3. This one is actually funny: some testers didn't claim the pages for
the process, so for some reason the pages they were using did not
show up as used / dirty etc., and all the testing was done in the 32kB
of L1 ... the tests were fast though :)

stress.sh will test operation of the whole system! It shifts a lot
of data so the disks are engaged, the CPU keeps pumping out CRC32 all
the time so it's busy, and RAM gets hit nicely as well due to high DMA.

Agreed, never just trust memtest86 or memtest86+.

FWIW, here's the routine I go through to test new RAM:
1. Run regular memtest86 for at least 3 full cycles in full SMP mode 
(F2 while starting up to force SMP).  On some systems this may hang, 
but that's an issue in the BIOS's setup of the CPU and MC, not the 
RAM, and is generally not indicative of a system which will have issues.
2. Run regular memtest86 for at least 3 full cycles in regular UP mode 
(the default on most non-NUMA hardware).
3. Repeat 1 and 2 with memtest86+.  It's diverged enough from regular 
memtest86 that it's functionally a separate tool, and I've seen RAM 
that passes one but not the other on multiple occasions before.
4. Boot SystemRescueCD, download a copy of the Linux sources, and run 
as many allmodconfig builds in parallel as I have CPUs, each with a 
number of make jobs equal to twice the number of CPUs (so each CPU 
ends up running at least two threads).  This forces enough context 
switching to completely trash even the L3 cache on almost any modern 
processor, which means it forces things out to RAM.  It won't hit all 
your RAM, but I've found it to be a relatively reliable way to verify 
the memory bus and the memory controller work properly.
5. Still from SystemRescueCD, use a tool called memtester (essentially 
memtest86, but run from userspace) to check the RAM.
6. Still from SystemRescueCD, use sha1sum to compute SHA-1 hashes of 
all the disks in the system, using at least 8 instances of sha1sum per 
CPU core, and make sure that all the sums for a disk match.
7. Do 6 again, but using cat to compute the sum of a concatenation of 
all the disks in the system (so the individual commands end up being 
`cat /dev/sd? | sha1sum`).  This will rapidly use all available memory 
on the system and keep it in use for quite a while.
8. If I'm using my home server system, I also have a special virtual 
runlevel set up where I spin up 4 times as many VM's as I 
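
The idea behind step 6 (hash the same data many times in parallel and
require every result to agree) can be sketched as follows. This is an
illustration only: it uses Python's hashlib instead of the sha1sum
binary, the function names are invented, and the number of instances is
a parameter rather than the "8 per CPU core" from the routine above.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha1_of(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_hashes_agree(path, instances=8):
    """Hash the same file in several parallel workers and compare results.

    Returns (all_equal, digests). Any disagreement means a read returned
    different data at some point, which points at hardware, not the file.
    """
    with ThreadPoolExecutor(max_workers=instances) as pool:
        digests = list(pool.map(sha1_of, [path] * instances))
    return len(set(digests)) == 1, digests
```

Pointing it at a whole-disk device node needs read permission on that
node, and it only reads data, so it is safe to run on a disk that is in
use.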

Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-08 Thread Chris Murphy
On Thu, Jul 7, 2016 at 11:40 AM, Chris Murphy  wrote:
> On Thu, Jul 7, 2016 at 10:01 AM, Henk Slager  wrote:
>
>> What the latest debian likes as naming convention I dont know, but in
>> openSuSE @ is a directory in the toplevel volume (ID=5 or ID=0 as
>> alias) and that directory contains subvolumes.
>
> No, opensuse doesn't use @ at all. They use a subvolume called
> .snapshots to contain snapper snapshots.

OK this has changed in openSUSE Tumbleweed. It does create an @
subvolume into which all other subvolumes are created including
.snapshots.

0:install:/mnt # btrfs sub list -t /mnt/
ID      gen     top level       path
--      ---     ---------       ----
257     31      5               @
258     12      257             @/.snapshots
259     37      258             @/.snapshots/1/snapshot
260     14      257             @/boot/grub2/i386-pc
261     15      257             @/boot/grub2/x86_64-efi
262     16      257             @/opt
263     17      257             @/srv
264     18      257             @/tmp
265     19      257             @/usr/local
266     35      257             @/var/cache
267     21      257             @/var/crash
268     23      257             @/var/lib/libvirt/images
269     23      257             @/var/lib/mailman
270     25      257             @/var/lib/mariadb
271     26      257             @/var/lib/mysql
272     26      257             @/var/lib/named
273     28      257             @/var/lib/pgsql
274     35      257             @/var/log
275     29      257             @/var/opt
276     30      257             @/var/spool
277     35      257             @/var/tmp

The installation time rootfs is
0:install:/mnt # mount | grep btrfs
/dev/vda2 on /mnt type btrfs
(rw,relatime,space_cache,subvolid=259,subvol=/@/.snapshots/1/snapshot)

I don't really understand the point of this additional layer of nesting under @.

I didn't test if it's still changing the default subvolume, rather
than using rootflags=subvol or subvolid.
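
To make the nesting visible, the table form of the subvolume listing can
be grouped by its "top level" column. A rough sketch, assuming the
four-column layout (ID, gen, top level, path) of "btrfs sub list -t";
the function names are made up for this example.

```python
def parse_sub_list(text):
    """Parse table-style subvolume listing rows into tuples.

    Each tuple is (id, gen, top_level, path). Header and separator
    lines are skipped because their first three fields are not numbers.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue
        try:
            subvol_id, gen, top_level = (int(p) for p in parts[:3])
        except ValueError:
            continue
        rows.append((subvol_id, gen, top_level, parts[3]))
    return rows


def children_by_parent(rows):
    """Group subvolume paths by the ID of the subvolume containing them."""
    tree = {}
    for _subvol_id, _gen, top_level, path in rows:
        tree.setdefault(top_level, []).append(path)
    return tree


if __name__ == "__main__":
    sample = "257 31 5 @\n258 12 257 @/.snapshots\n259 37 258 @/.snapshots/1/snapshot\n"
    print(children_by_parent(parse_sub_list(sample)))
```

In the listing above only @ hangs off the top level (ID 5); everything
else is nested under ID 257 or 258.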



-- 
Chris Murphy


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-08 Thread Henk Slager
On Fri, Jul 8, 2016 at 11:50 PM, Chris Murphy  wrote:
> On Fri, Jul 8, 2016 at 3:39 PM, Chris Murphy  wrote:
>> On Fri, Jul 8, 2016 at 2:08 PM, Kai Herlemann  wrote:
>>
>>> If here any developers read along: I'd like to suggest that there's
>>> automatically made a subvolume "@" by default, which is set as default
>>> subvolume, or a tip to the distribution, that it would made sense to do that
>>> with the installation. It would protect other users against confusion and
>>> work like I had it.
>>
>> I think that upstream won't do that or recommend it. There is already
>> a subvolume created at mkfs time, that's subvolid=5 (a.k.a. 0) and it
>> is set as the default subvolume. I don't see the point in having two
>> of them. If you want it, make it. If your distro wants it, it should
>> be done in the installer, not mkfs.
>>
>> Further I think it's inappropriate to take 'btrfs sub set-default'
>> away from the user. That is a user owned setting. It is not OK for
>> some utility to assert domain over that setting, and depend on it for
>> proper booting. It makes the entire boot process undiscoverable,
>> breaks self-describing boot process which are simpler to understand
>> and troubleshoot, in favor of secret decoder ring booting that now
>> requires even more esoteric knowledge on the part of users. So I think
>> it's a bad design.
>>
>> Instead those utilities should employ rootflags=subvol or subvolid to
>> explicitly use a particular fs tree for rootfs, rather that hide this
>> fact by using subvolume set-default.
>
> The only distro installer I know that works this way out of the box is
> Fedora/Red Hat's Anaconda. It leaves the default subvolume as 5, but
> does not install the OS there. Instead each mountpoint is created as a
> subvolume in that top level, and rootflags kernel parameter and fstab
> are used to assemble those subvolumes per the FHS virtually. It's
> completely discoverable, you can follow each step along the way, it's
> not obscured.
>
> The additional benefit is no nested subvolumes.
>
> A possible improvement for those distros that will likely continue
> doing things the way they are, would be if the kernel code stated what
> fs tree ID was being mounted when the default subvolume is not 5, and
> neither subvol nor subvolid mount options were used. *shrug*

On a running system as non-root:
$ mount | grep "on / type btrfs"
/dev/sda1 on / type btrfs
(rw,noatime,compress=lzo,ssd,discard,space_cache,subvolid=2429,subvol=/@/latestrootfs)

On an image of a disk or some separate disk with rootfs tree mounted
somewhere, I agree that it might look 'hidden'; you will have to
realize that the filesystem is Btrfs and that the default subvol might
not be 5, but  btrfs sub list / gives the answer to what more is in
the pool.
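
For scripts that need this information, the subvolid and subvol options
can be pulled out of a mount(8) output line. A minimal sketch with a
hypothetical helper name; it assumes the "DEV on DIR type FSTYPE (OPTS)"
format shown above on a single line, and it does not handle mount points
that contain spaces.

```python
import re

# One line of mount(8) output: "<device> on <dir> type <fstype> (<options>)"
MOUNT_RE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def parse_mount_line(line):
    """Split a mount output line into its fields and an options dict."""
    match = MOUNT_RE.match(line.strip())
    if match is None:
        return None
    device, mountpoint, fstype, raw_options = match.groups()
    options = {}
    for option in raw_options.split(","):
        # Flags such as "rw" have no value and map to an empty string.
        key, _, value = option.partition("=")
        options[key] = value
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options,
    }


if __name__ == "__main__":
    line = (
        "/dev/sda1 on / type btrfs "
        "(rw,noatime,compress=lzo,ssd,discard,space_cache,"
        "subvolid=2429,subvol=/@/latestrootfs)"
    )
    info = parse_mount_line(line)
    print(info["options"]["subvolid"], info["options"]["subvol"])
```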


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-08 Thread Chris Murphy
On Fri, Jul 8, 2016 at 3:39 PM, Chris Murphy  wrote:
> On Fri, Jul 8, 2016 at 2:08 PM, Kai Herlemann  wrote:
>
>> If here any developers read along: I'd like to suggest that there's
>> automatically made a subvolume "@" by default, which is set as default
>> subvolume, or a tip to the distribution, that it would made sense to do that
>> with the installation. It would protect other users against confusion and
>> work like I had it.
>
> I think that upstream won't do that or recommend it. There is already
> a subvolume created at mkfs time, that's subvolid=5 (a.k.a. 0) and it
> is set as the default subvolume. I don't see the point in having two
> of them. If you want it, make it. If your distro wants it, it should
> be done in the installer, not mkfs.
>
> Further I think it's inappropriate to take 'btrfs sub set-default'
> away from the user. That is a user owned setting. It is not OK for
> some utility to assert domain over that setting, and depend on it for
> proper booting. It makes the entire boot process undiscoverable,
> breaks self-describing boot process which are simpler to understand
> and troubleshoot, in favor of secret decoder ring booting that now
> requires even more esoteric knowledge on the part of users. So I think
> it's a bad design.
>
> Instead those utilities should employ rootflags=subvol or subvolid to
> explicitly use a particular fs tree for rootfs, rather that hide this
> fact by using subvolume set-default.

The only distro installer I know that works this way out of the box is
Fedora/Red Hat's Anaconda. It leaves the default subvolume as 5, but
does not install the OS there. Instead each mountpoint is created as a
subvolume in that top level, and rootflags kernel parameter and fstab
are used to assemble those subvolumes per the FHS virtually. It's
completely discoverable, you can follow each step along the way, it's
not obscured.

The additional benefit is no nested subvolumes.
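
To illustrate what "explicit" means here, this sketch generates the
kernel parameter and the fstab lines for a flat layout. It is not
Anaconda's code; the subvolume names ("root", "home") and the UUID are
placeholders chosen for the example, and the function names are made up.

```python
def rootflags(subvol):
    """Kernel command line parameter selecting the rootfs subvolume by name."""
    return "rootflags=subvol=" + subvol


def fstab_lines(fs_uuid, layout):
    """Build one fstab line per mount point.

    layout maps mount points to subvolume names that all live directly
    in the top level (ID 5), so no subvolume is nested in another.
    """
    lines = []
    for mountpoint, subvol in sorted(layout.items()):
        lines.append(
            "UUID={} {} btrfs subvol={} 0 0".format(fs_uuid, mountpoint, subvol)
        )
    return lines


if __name__ == "__main__":
    print(rootflags("root"))
    for line in fstab_lines("EXAMPLE-UUID", {"/": "root", "/home": "home"}):
        print(line)
```

Everything needed to find the rootfs is then visible in the boot loader
configuration and in fstab, independent of the default subvolume.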

A possible improvement for those distros that will likely continue
doing things the way they are, would be if the kernel code stated what
fs tree ID was being mounted when the default subvolume is not 5, and
neither subvol nor subvolid mount options were used. *shrug*


-- 
Chris Murphy


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Adam Borowski
On Fri, Jul 08, 2016 at 08:21:16PM +0200, Adam Borowski wrote:
> On Fri, Jul 08, 2016 at 12:02:35PM -0400, Chris Mason wrote:
> > On 07/08/2016 11:02 AM, Gabriel C wrote:
> > > [btrfs_destroy_inode again]
> 
> > Can you please run the attached test program:
> > 
> > gcc -o short-write short-write.c -lpthread
> > ./short-write some-new-file-on-btrfs
> > 
> > I want to see if you're triggering the same problem we've tried to fix, or
> > something else.
> 
> Looks like same, 4.6.3:
[...]
> ... and sda1 goes ro.
> Single device, noatime,compress=lzo,ssd.
> 
> It's somewhat puzzling that back in the day applying 56244ef15 stopped this
> reproducer for me, yet somehow it triggers again.

The above on 4.6.3 triggered pretty much immediately.  I then compiled a
fresh 4.7-rc6+ (today's Linus' master), which triggered only after a lot of
time and effort.  First I tried on a freshly formatted 2TB spinning rust, no
luck.  Then on 1GB rust, almost full of a mixture of crap.  Then on my
regular ssd -- it survived an hour or so with little concurrent use, then
went boom only late in a kernel compile on that filesystem.

Same backtrace.

-- 
An imaginary friend squared is a real enemy.


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-08 Thread Chris Murphy
On Fri, Jul 8, 2016 at 2:08 PM, Kai Herlemann  wrote:

> If here any developers read along: I'd like to suggest that there's
> automatically made a subvolume "@" by default, which is set as default
> subvolume, or a tip to the distribution, that it would made sense to do that
> with the installation. It would protect other users against confusion and
> work like I had it.

I think that upstream won't do that or recommend it. There is already
a subvolume created at mkfs time, that's subvolid=5 (a.k.a. 0) and it
is set as the default subvolume. I don't see the point in having two
of them. If you want it, make it. If your distro wants it, it should
be done in the installer, not mkfs.

Further I think it's inappropriate to take 'btrfs sub set-default'
away from the user. That is a user owned setting. It is not OK for
some utility to assert domain over that setting, and depend on it for
proper booting. It makes the entire boot process undiscoverable,
breaks self-describing boot process which are simpler to understand
and troubleshoot, in favor of secret decoder ring booting that now
requires even more esoteric knowledge on the part of users. So I think
it's a bad design.

Instead those utilities should employ rootflags=subvol or subvolid to
explicitly use a particular fs tree for rootfs, rather that hide this
fact by using subvolume set-default.



-- 
Chris Murphy


Re: [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON

2016-07-08 Thread Liu Bo
On Fri, Jul 08, 2016 at 06:05:16PM +0200, David Sterba wrote:
> On Tue, May 03, 2016 at 04:30:54PM -0700, Liu Bo wrote:
> > > Just a heads up that this seems to introduce a valid warning, since it now
> > > can goto error before the first initializing use of path:
> > > 
> > > fs/btrfs/volumes.c: In function 'btrfs_balance':
> > > fs/btrfs/volumes.c:3601:2: warning: 'path' may be used uninitialized
> > > in this function [-Wmaybe-uninitialized]
> > >   btrfs_free_path(path);
> > >   ^
> > > fs/btrfs/volumes.c:3385:21: note: 'path' was declared here
> > >   struct btrfs_path *path;
> > >  ^
> > > (it's really in __btrfs_balance which got inlined, so gcc thinks it's
> > > at the call site).
> > > 
> > > Simply setting path = NULL at the beginning of __btrfs_balance fixes it,
> > > since btrfs_free_path allows NULL values.
> > 
> > That's right, it's weird that I didn't get this warning while testing it.
> > 
> > Thanks for catching it, Holger.
> 
> Please send a v2, the patch is desirable.

Oh, I almost forgot this one, thanks for the reminder.

Thanks,

-liubo


Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure

2016-07-08 Thread Liu Bo
On Fri, Jul 08, 2016 at 06:01:49PM +0200, David Sterba wrote:
> On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> > eb->io_pages is set in read_extent_buffer_pages().
> > 
> > In case of readpage failure, for pages that have been added to bio,
> > it calls bio_endio and later readpage_io_failed_hook() does the work.
> > 
> > When this eb's page (couldn't be the 1st page) fails to add itself to bio
> > due to failure in merge_bio(), it cannot decrease eb->io_pages via
> > bio_endio, and ends up with a memory leak eventually.
> > 
> > This lets __do_readpage propagate errors to callers and adds the
> > 'atomic_dec(&eb->io_pages)'.
> 
> I'm not sure, but could we lose some error values from __do_readpage?
> Ie. return 0 even if there was an error in a page that's in the middle
> (not the first, not the last).
> 
> The loop in __do_readpage iterates while (cur <= end), and ret is only
> set by submit_extent_page, but the loop does not exit immediately. So
> we can detect error, set page error state bit, but next loop will
> overwrite ret with 0 (if the page submission was ok).
> 
> Then we still don't decrement the io_pages as needed.

Right, it still has that problem. The possible way I can see is to break
out of the while (cur <= end) loop when submit_extent_page() fails and
pass the error up to the caller, so that the rest of the eb->io_pages
cleanup work can be done in read_extent_buffer_pages(), just like we did
in write_one_eb() (this was already suggested by Josef, but it seems I
was off the right track).

This also assumes that if one page fails in submit_extent_page(), the
remaining pages are likely to fail as well.
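
The difference between the two loop shapes can be shown with a toy model.
This is not kernel code: each list element stands for the return value of
one submit_extent_page() call (0 for success, a negative errno
otherwise), and both function names are invented for the illustration.

```python
def last_result_wins(results):
    """Model of the current loop: ret is overwritten on every iteration."""
    ret = 0
    for page_ret in results:
        # A later successful submission hides an earlier failure.
        ret = page_ret
    return ret


def stop_at_first_error(results):
    """Model of the proposed loop: leave at the first failed submission.

    Returns (ret, submitted), where submitted counts the pages that
    went out before the failure, so the caller knows how many pages
    still need their io_pages accounting cleaned up.
    """
    submitted = 0
    for page_ret in results:
        if page_ret != 0:
            return page_ret, submitted
        submitted += 1
    return 0, submitted


if __name__ == "__main__":
    middle_page_fails = [0, -5, 0]
    print(last_result_wins(middle_page_fails))
    print(stop_at_first_error(middle_page_fails))
```

With a failure on the middle page, the first variant reports success
while the second one reports the error to the caller.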

What do you think?

Thanks,

-liubo


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-08 Thread Kai Herlemann

On 07.07.2016 at 19:40, Chris Murphy wrote:

And very clearly from the OP's output from 'btrfs sub list' there are
no subvolumes with @ in the path, so there is no subvolume @, nor are
there any subvolumes contained in a directory @.

[...]

Anyway the reason why the command fails is stated in the error
message. The system appears to be installed in the top level of the
file system (subvolid=5), and that can't be deleted. First it's the
immutable first subvolume of a Btrfs file system, and second it's
populated with other subvolumes which would inhibit its removal even
if it weren't the top level subvolume.

What can be done is delete the directories in the top level, retaining
the subvolumes that are there.

Thank you, that was it: there really was no subvolume named @.
Thank you to Henk and Andrei, too.
I didn't believe that, although there was no @ in the output of
"btrfs sub list", for two reasons. First, all the websites that dealt
with this topic and which I used for research stated that a subvolume
named @ would be created automatically (or I misunderstood the sites).
Second, the ID of the top level volume is 5 in my case, and I
(mis)understood that, in cases where the subvolume "@" is created
automatically, the ID of that subvolume would also be 5.
I have now created a subvolume "@" myself on the top level volume,
moved all the data from the snapshot I had been using for the last few
days to the new subvolume, and then deleted all data from the top level
volume, except the subvolume @ of course. Before that, I made a backup
snapshot of the top level volume. Other users who read this later and
want to move their data off the top level volume should be able to do
the same.


If any developers are reading along: I'd like to suggest that a
subvolume "@" be created automatically and set as the default
subvolume, or that distributions be advised that it would make sense to
do that during installation. It would protect other users against
confusion and would work the way I had it set up.


Thank you,
Kai


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-08 Thread Andrei Borzenkov
07.07.2016 15:17, Kai Herlemann wrote:
> Hi,
> 
> I want to rollback a snapshot and have done this by execute "btrfs sub
> set-default / 618".
> Now I want to delete the old top volume to save space, but google and
> manuals didn't helped.
> 
> I mounted for the following the root volume at /mnt/gparted with
> subvolid=0, subvol=/ has the same effect.
> Usually, the top volume is saved in /@, so I would be able to delete it
> by execute "btrfs sub delete /@" (or move at first @ to @_badroot and
> the snapshot to @). But that isn't possible, the output of that command
> is "ERROR: cannot access subvolume /@: No such file or directory".
> I've posted the output of "btrfs sub list /mnt/gparted" at
> http://pastebin.com/r7WNbJq8. As you can see, there's no subvolume named @.
> 
> I have the same problem with my /home/ partition.
> 
> Output of "uname -a" (self-compiled kernel):
> Linux debian-linux 4.1.26 #1 SMP Wed Jun 8 18:40:04 CEST 2016 x86_64
> GNU/Linux
> 

You need to ask on your distribution's list; this is really not a
question about btrfs but rather about how the distribution manages
snapshots.

But if you originally installed into the top level subvolume, then you
have no way to delete it. You may try to manually clean the contents of
the top level subvolume if you need to free space.

That was the initial implementation done by (open)SUSE, and they later
changed it to install into a subvolume from the very start. But the
information you provided is not enough to know how the system was
originally installed.


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Duncan
Chris Mason posted on Fri, 08 Jul 2016 12:02:35 -0400 as excerpted:

> Can you please run the attached test program:

Umm... you want him to run it on the affected 4.6.x and late 4.7-rcs, not 
on the unaffected 4.5.x, correct?

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Adam Borowski
On Fri, Jul 08, 2016 at 12:02:35PM -0400, Chris Mason wrote:
> On 07/08/2016 11:02 AM, Gabriel C wrote:
> > [btrfs_destroy_inode again]

> Can you please run the attached test program:
> 
> gcc -o short-write short-write.c -lpthread
> ./short-write some-new-file-on-btrfs
> 
> I want to see if you're triggering the same problem we've tried to fix, or
> something else.

Looks like same, 4.6.3:

[239214.345100] ------------[ cut here ]------------
[239214.345122] WARNING: CPU: 0 PID: 28038 at fs/btrfs/inode.c:9261 btrfs_destroy_inode+0x22b/0x2a0
[239214.345127] Modules linked in: vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) nls_utf8 isofs loop tun usb_storage fuse nvidia(PO)
[239214.345153] CPU: 0 PID: 28038 Comm: short-write Tainted: P   O 4.6.3-x32+ #8
[239214.345158] Hardware name: System manufacturer System Product Name/M4A77T, BIOS 2401 05/18/2011
[239214.345162]   813819b0
[239214.345170]  8108ce92  88010c2d71c8 880222670800
[239214.345177]   88010c2d71c8 ff9c 812b1e0b
[239214.345183] Call Trace:
[239214.345194]  [] ? dump_stack+0x46/0x66
[239214.345202]  [] ? __warn+0xe2/0x100
[239214.345209]  [] ? btrfs_destroy_inode+0x22b/0x2a0
[239214.345216]  [] ? do_unlinkat+0x12c/0x2f0
[239214.345224]  [] ? entry_SYSCALL_64_fastpath+0x17/0x93
[239214.345229] ---[ end trace ae49f2a4ae4a26ea ]---
[239214.404237] [ cut here ]
[239214.404258] WARNING: CPU: 5 PID: 28038 at fs/btrfs/extent-tree.c:4233 
btrfs_free_reserved_data_space_noquota+0x63/0x80
[239214.404262] Modules linked in: vboxpci(O) vboxnetadp(O) vboxnetflt(O) 
vboxdrv(O) nls_utf8 isofs loop tun usb_storage fuse nvidia(PO)
[239214.404286] CPU: 5 PID: 28038 Comm: short-write Tainted: PW  O
4.6.3-x32+ #8
[239214.404291] Hardware name: System manufacturer System Product Name/M4A77T, 
BIOS 240105/18/2011
[239214.404295]   813819b0  

[239214.404303]  8108ce92 1000 880229415c00 
037ea000
[239214.404309]  8800c2ed7da0 88000dd3d800  
81287143
[239214.404315] Call Trace:
[239214.404325]  [] ? dump_stack+0x46/0x66
[239214.404333]  [] ? __warn+0xe2/0x100
[239214.404341]  [] ? 
btrfs_free_reserved_data_space_noquota+0x63/0x80
[239214.404349]  [] ? btrfs_delalloc_release_space+0x31/0x60
[239214.404356]  [] ? __btrfs_buffered_write+0x591/0x680
[239214.404364]  [] ? btrfs_file_write_iter+0x182/0x550
[239214.404372]  [] ? __vfs_write+0xa9/0xe0
[239214.404379]  [] ? vfs_write+0xac/0x1a0
[239214.404387]  [] ? SyS_pwrite64+0x88/0xa0
[239214.404394]  [] ? entry_SYSCALL_64_fastpath+0x17/0x93
[239214.404399] ---[ end trace ae49f2a4ae4a26eb ]---

... and sda1 goes ro.
Single device, noatime,compress=lzo,ssd.

It's somewhat puzzling that back in the day applying 56244ef15 stopped this
reproducer for me, yet somehow it triggers again.

-- 
An imaginary friend squared is a real enemy.


Re: Frequent btrfs corruption on a USB flash drive

2016-07-08 Thread Henk Slager
>> Device is GOOD
>>
>> I also created a big file with dd using /dev/urandom with the same size
>> as my flash drive, copied it once and read it three times. The SHA-1
>> checksum is always the same and matches the original one on the hard disk.
>>
>> So after much testing I feel I can conclude that my USB flash drive is
>> not fake and it is not defective.
>>
> For what it's worth, there's multiple other things that could cause similar
> issues.  I've had a number of cases where bad USB hubs or poorly designed
> (or just buggy or failing) USB controllers caused similar data corruption,
> the most recent one being an issue with both a bad USB 2.0 hub (which did
> not properly implement the USB standard, counterfeit USB devices come in all
> types) and a malfunctioning USB 3.0 controller (which did not properly
> account for things that didn't properly implement the standard and had no
> recovery code to handle this in the drivers).  I ended up in most cases
> checking the ports using other USB devices (at least a keyboard, a mouse,
> and a USB serial adapter).

Like Austin, I also want to note that there might be USB-related
issues that only pop up after some time and not in tests.

For example, this weekend I connected a 2.5inch 500G drive with its
Y-cable to a H87M-Pro board that is fed by a 80+Gold PSU, despite its
many 'bad sectors' I remembered from 2 years ago in a btrfs raid1
setup. This 500G disk has worked well for almost 2 years connected to
a 7-inch eeepc4G, XFS formatted. But with the H87M-Pro I just now saw
that it dropped off the USB every now and then, causing trouble for
Btrfs.

For connecting harddisks to phones, I once bought an externally powered
hub, and I put that between the board and the 500G disk => that made
it all stable, no disconnects and Btrfs works fine as expected. I had
similar issues on another PC with a Sandisk Extreme 64G USB3 stick,
but that was likely a protocol issue.

So maybe try to use the stick with your use case in another HW setup,
hopefully then it is stable for a longer time than the few days now.


Re: Frequent btrfs corruption on a USB flash drive

2016-07-08 Thread Austin S. Hemmelgarn

On 2016-07-08 12:10, Francesco Turco wrote:

On 2016-07-07 19:57, Chris Murphy wrote:

Use F3 to test flash:
http://oss.digirati.com.br/f3/


I tested my USB flash drive with F3 as you suggested, and there's no
indication it is a fake device.

---

# f3probe --destructive /dev/sdb
F3 probe 6.0
Copyright (C) 2010 Digirati Internet LTDA.
This is free software; see the source for copying conditions.

WARNING: Probing normally takes from a few seconds to 15 minutes, but
 it can take longer. Please be patient.

Good news: The device `/dev/sdb' is the real thing

Device geometry:
 *Usable* size: 57.69 GB (120979456 blocks)
Announced size: 57.69 GB (120979456 blocks)
Module: 64.00 GB (2^36 Bytes)
Approximate cache size: 0.00 Byte (0 blocks), need-reset=no
   Physical block size: 512.00 Byte (2^9 Bytes)

Probe time: 2'23"

--

$ f3read /run/media/fturco/a7d8a7b1-e0c2-4fbb-879f-e17046989f3c
  SECTORS  ok/corrupted/changed/overwritten
Validating file 1.h2w ... 2097152/0/  0/  0
Validating file 2.h2w ... 2097152/0/  0/  0
Validating file 3.h2w ... 2097152/0/  0/  0
Validating file 4.h2w ... 2097152/0/  0/  0
Validating file 5.h2w ... 2097152/0/  0/  0
Validating file 6.h2w ... 2097152/0/  0/  0
Validating file 7.h2w ... 2097152/0/  0/  0
Validating file 8.h2w ... 2097152/0/  0/  0
Validating file 9.h2w ... 2097152/0/  0/  0
Validating file 10.h2w ... 2097152/0/  0/  0
Validating file 11.h2w ... 2097152/0/  0/  0
Validating file 12.h2w ... 2097152/0/  0/  0
Validating file 13.h2w ... 2097152/0/  0/  0
Validating file 14.h2w ... 2097152/0/  0/  0
Validating file 15.h2w ... 2097152/0/  0/  0
Validating file 16.h2w ... 2097152/0/  0/  0
Validating file 17.h2w ... 2097152/0/  0/  0
Validating file 18.h2w ... 2097152/0/  0/  0
Validating file 19.h2w ... 2097152/0/  0/  0
Validating file 20.h2w ... 2097152/0/  0/  0
Validating file 21.h2w ... 2097152/0/  0/  0
Validating file 22.h2w ... 2097152/0/  0/  0
Validating file 23.h2w ... 2097152/0/  0/  0
Validating file 24.h2w ... 2097152/0/  0/  0
Validating file 25.h2w ... 2097152/0/  0/  0
Validating file 26.h2w ... 2097152/0/  0/  0
Validating file 27.h2w ... 2097152/0/  0/  0
Validating file 28.h2w ... 2097152/0/  0/  0
Validating file 29.h2w ... 2097152/0/  0/  0
Validating file 30.h2w ... 2097152/0/  0/  0
Validating file 31.h2w ... 2097152/0/  0/  0
Validating file 32.h2w ... 2097152/0/  0/  0
Validating file 33.h2w ... 2097152/0/  0/  0
Validating file 34.h2w ... 2097152/0/  0/  0
Validating file 35.h2w ... 2097152/0/  0/  0
Validating file 36.h2w ... 2097152/0/  0/  0
Validating file 37.h2w ... 2097152/0/  0/  0
Validating file 38.h2w ... 2097152/0/  0/  0
Validating file 39.h2w ... 2097152/0/  0/  0
Validating file 40.h2w ... 2097152/0/  0/  0
Validating file 41.h2w ... 2097152/0/  0/  0
Validating file 42.h2w ... 2097152/0/  0/  0
Validating file 43.h2w ... 2097152/0/  0/  0
Validating file 44.h2w ... 2097152/0/  0/  0
Validating file 45.h2w ... 2097152/0/  0/  0
Validating file 46.h2w ... 2097152/0/  0/  0
Validating file 47.h2w ... 2097152/0/  0/  0
Validating file 48.h2w ... 2097152/0/  0/  0
Validating file 49.h2w ... 2097152/0/  0/  0
Validating file 50.h2w ... 2097152/0/  0/  0
Validating file 51.h2w ... 2097152/0/  0/  0
Validating file 52.h2w ... 2097152/0/  0/  0
Validating file 53.h2w ... 2097152/0/  0/  0
Validating file 54.h2w ... 2097152/0/  0/  0
Validating file 55.h2w ... 2097152/0/  0/  0
Validating file 56.h2w ... 1364266/0/  0/  0

  Data OK: 55.65 GB (116707626 sectors)
Data LOST: 0.00 Byte (0 sectors)
   Corrupted: 0.00 Byte (0 sectors)
Slightly changed: 0.00 Byte (0 sectors)
 Overwritten: 0.00 Byte (0 sectors)
Average reading speed: 34.73 MB/s




Read more, and also includes a much faster alternative for GNOME:
https://blogs.gnome.org/hughsie/2015/01/28/detecting-fake-flash/


I also tested my flash drive with gnome-multi-writer-probe, and it says
it is not fake:

# gnome-multi-writer-probe 

[bug report] btrfs: root->fs_info cleanup, add fs_info convenience variables

2016-07-08 Thread Dan Carpenter
Hello Jeff Mahoney,

This is a semi-automatic email about new static checker warnings.

The patch b286384aac32: "btrfs: root->fs_info cleanup, add fs_info
convenience variables" from Jun 22, 2016, leads to the following
Smatch complaint:

fs/btrfs/export.c:238 btrfs_get_name()
 warn: variable dereferenced before check 'inode' (see line 226)

fs/btrfs/export.c
   225  struct inode *dir = d_inode(parent);
   226  struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 ^^^
New dereference.

   227  struct btrfs_path *path;
   228  struct btrfs_root *root = BTRFS_I(dir)->root;
   229  struct btrfs_inode_ref *iref;
   230  struct btrfs_root_ref *rref;
   231  struct extent_buffer *leaf;
   232  unsigned long name_ptr;
   233  struct btrfs_key key;
   234  int name_len;
   235  int ret;
   236  u64 ino;
   237  
   238  if (!dir || !inode)
 ^
Old code assumed it can be NULL.

   239  return -EINVAL;
   240  
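For illustration only, the pattern Smatch flags here can be shown in a small userspace sketch. The types and the function name below are hypothetical stand-ins, not the btrfs definitions, and this is not the actual fix. The idea is to declare the variable first and assign it only after the NULL check, so the check is no longer dead code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct sb_demo { int fs_info; };
struct inode_demo { struct sb_demo *i_sb; };

/*
 * Sketch: fetch the per-filesystem data only after both pointers have been
 * validated, instead of dereferencing 'inode' in the declaration block.
 */
int get_name_demo(struct inode_demo *dir, struct inode_demo *inode,
		  int *fs_info_out)
{
	int fs_info;

	if (!dir || !inode)
		return -EINVAL;

	fs_info = inode->i_sb->fs_info;	/* dereference happens after the check */
	*fs_info_out = fs_info;
	return 0;
}
```

With this ordering a NULL inode yields -EINVAL rather than a crash. Whether inode can actually be NULL in btrfs_get_name() is a separate question that the report leaves open.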

regards,
dan carpenter


Re: Frequent btrfs corruption on a USB flash drive

2016-07-08 Thread Francesco Turco
On 2016-07-07 19:57, Chris Murphy wrote:
> Use F3 to test flash:
> http://oss.digirati.com.br/f3/

I tested my USB flash drive with F3 as you suggested, and there's no
indication it is a fake device.

---

# f3probe --destructive /dev/sdb
F3 probe 6.0
Copyright (C) 2010 Digirati Internet LTDA.
This is free software; see the source for copying conditions.

WARNING: Probing normally takes from a few seconds to 15 minutes, but
 it can take longer. Please be patient.

Good news: The device `/dev/sdb' is the real thing

Device geometry:
 *Usable* size: 57.69 GB (120979456 blocks)
Announced size: 57.69 GB (120979456 blocks)
Module: 64.00 GB (2^36 Bytes)
Approximate cache size: 0.00 Byte (0 blocks), need-reset=no
   Physical block size: 512.00 Byte (2^9 Bytes)

Probe time: 2'23"

--

$ f3read /run/media/fturco/a7d8a7b1-e0c2-4fbb-879f-e17046989f3c
  SECTORS  ok/corrupted/changed/overwritten
Validating file 1.h2w ... 2097152/0/  0/  0
Validating file 2.h2w ... 2097152/0/  0/  0
Validating file 3.h2w ... 2097152/0/  0/  0
Validating file 4.h2w ... 2097152/0/  0/  0
Validating file 5.h2w ... 2097152/0/  0/  0
Validating file 6.h2w ... 2097152/0/  0/  0
Validating file 7.h2w ... 2097152/0/  0/  0
Validating file 8.h2w ... 2097152/0/  0/  0
Validating file 9.h2w ... 2097152/0/  0/  0
Validating file 10.h2w ... 2097152/0/  0/  0
Validating file 11.h2w ... 2097152/0/  0/  0
Validating file 12.h2w ... 2097152/0/  0/  0
Validating file 13.h2w ... 2097152/0/  0/  0
Validating file 14.h2w ... 2097152/0/  0/  0
Validating file 15.h2w ... 2097152/0/  0/  0
Validating file 16.h2w ... 2097152/0/  0/  0
Validating file 17.h2w ... 2097152/0/  0/  0
Validating file 18.h2w ... 2097152/0/  0/  0
Validating file 19.h2w ... 2097152/0/  0/  0
Validating file 20.h2w ... 2097152/0/  0/  0
Validating file 21.h2w ... 2097152/0/  0/  0
Validating file 22.h2w ... 2097152/0/  0/  0
Validating file 23.h2w ... 2097152/0/  0/  0
Validating file 24.h2w ... 2097152/0/  0/  0
Validating file 25.h2w ... 2097152/0/  0/  0
Validating file 26.h2w ... 2097152/0/  0/  0
Validating file 27.h2w ... 2097152/0/  0/  0
Validating file 28.h2w ... 2097152/0/  0/  0
Validating file 29.h2w ... 2097152/0/  0/  0
Validating file 30.h2w ... 2097152/0/  0/  0
Validating file 31.h2w ... 2097152/0/  0/  0
Validating file 32.h2w ... 2097152/0/  0/  0
Validating file 33.h2w ... 2097152/0/  0/  0
Validating file 34.h2w ... 2097152/0/  0/  0
Validating file 35.h2w ... 2097152/0/  0/  0
Validating file 36.h2w ... 2097152/0/  0/  0
Validating file 37.h2w ... 2097152/0/  0/  0
Validating file 38.h2w ... 2097152/0/  0/  0
Validating file 39.h2w ... 2097152/0/  0/  0
Validating file 40.h2w ... 2097152/0/  0/  0
Validating file 41.h2w ... 2097152/0/  0/  0
Validating file 42.h2w ... 2097152/0/  0/  0
Validating file 43.h2w ... 2097152/0/  0/  0
Validating file 44.h2w ... 2097152/0/  0/  0
Validating file 45.h2w ... 2097152/0/  0/  0
Validating file 46.h2w ... 2097152/0/  0/  0
Validating file 47.h2w ... 2097152/0/  0/  0
Validating file 48.h2w ... 2097152/0/  0/  0
Validating file 49.h2w ... 2097152/0/  0/  0
Validating file 50.h2w ... 2097152/0/  0/  0
Validating file 51.h2w ... 2097152/0/  0/  0
Validating file 52.h2w ... 2097152/0/  0/  0
Validating file 53.h2w ... 2097152/0/  0/  0
Validating file 54.h2w ... 2097152/0/  0/  0
Validating file 55.h2w ... 2097152/0/  0/  0
Validating file 56.h2w ... 1364266/0/  0/  0

  Data OK: 55.65 GB (116707626 sectors)
Data LOST: 0.00 Byte (0 sectors)
   Corrupted: 0.00 Byte (0 sectors)
Slightly changed: 0.00 Byte (0 sectors)
 Overwritten: 0.00 Byte (0 sectors)
Average reading speed: 34.73 MB/s



> Read more, and also includes a much faster alternative for GNOME:
> https://blogs.gnome.org/hughsie/2015/01/28/detecting-fake-flash/

I also tested my flash drive with gnome-multi-writer-probe, and it says
it is not fake:

# gnome-multi-writer-probe /dev/sdb
Device is GOOD

I also created a big 

Re: [PATCH] Btrfs: fix unexpected balance crash due to BUG_ON

2016-07-08 Thread David Sterba
On Tue, May 03, 2016 at 04:30:54PM -0700, Liu Bo wrote:
> > Just a heads up that this seems to introduce a valid warning, since it now
> > can goto error before the first initializing use of path:
> > 
> > fs/btrfs/volumes.c: In function 'btrfs_balance':
> > fs/btrfs/volumes.c:3601:2: warning: 'path' may be used uninitialized
> > in this function [-Wmaybe-uninitialized]
> >   btrfs_free_path(path);
> >   ^
> > fs/btrfs/volumes.c:3385:21: note: 'path' was declared here
> >   struct btrfs_path *path;
> >  ^
> > (it's really in __btrfs_balance which got inlined, so gcc thinks it's
> > at the call site).
> > 
> > Simply setting path = NULL at the beginning of __btrfs_balance fixes it, 
> > since
> > btrfs_free_path allows NULL values.
> 
> That's right, it's weird that I didn't get this warning while testing it.
> 
> Thanks for catching it, Holger.

Please send a v2, the patch is desirable.
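For readers following along, the warning and the suggested fix can be reproduced in a small userspace sketch. The names path_demo, free_path_demo and balance_demo are hypothetical stand-ins. The only property borrowed from the thread is that the free routine accepts NULL, as btrfs_free_path is said to.

```c
#include <stdlib.h>

struct path_demo { int dummy; };

/* Like free(), tolerates NULL; the thread states btrfs_free_path does too. */
void free_path_demo(struct path_demo *p)
{
	free(p);
}

int balance_demo(int fail_early)
{
	struct path_demo *path = NULL;	/* initialised so the early goto is safe */
	int ret = 0;

	if (fail_early) {
		ret = -1;
		goto error;	/* reached before path is allocated */
	}

	path = malloc(sizeof(*path));
	if (!path) {
		ret = -2;
		goto error;
	}
	/* ... work with path would go here ... */
error:
	free_path_demo(path);	/* safe for both NULL and allocated path */
	return ret;
}
```

Without the "= NULL" initialiser, the early goto would hand an indeterminate pointer to the free routine, which is what the gcc warning points at.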


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Chris Mason



On 07/08/2016 11:02 AM, Gabriel C wrote:
> On 08.07.2016 14:41, Chris Mason wrote:
>> On 07/08/2016 05:57 AM, Gabriel C wrote:
>>> 2016-07-07 21:21 GMT+02:00 Chris Mason :
>>>> On 07/07/2016 06:24 AM, Gabriel C wrote:
>>>>> Hi,
>>>>>
>>>>> while running thunderbird on linux 4.6.3 and 4.7.0-rc6 ( didn't tested
>>>>> other versions )
>>>>> I trigger the following :
>>>>
>>>> I definitely thought we had this fixed in v4.7-rc.  Can you easily
>>>> fsck this filesystem?  Something strange is going on.
>>>
>>> Yes , btrfs check and btrfs check  --check-data-csum are fine , no
>>> errors found.
>>>
>>> If you want me to test any patches let me know.
>>
>> Can you please try a v4.5 stable kernel?  I'm curious if this really
>> is the same regression that I tried to fix in v4.7
>
> I'm on linux 4.5.7 now and everything is fine. I'm writing this email
> from thunderbird.. which was not
> possible in 4.6.3 or 4.7.-rc.
>
> Let me know you want me to test other kernels or whatever else may help
> fixing this problem.



Can you please run the attached test program:

gcc -o short-write short-write.c -lpthread
./short-write some-new-file-on-btrfs

I want to see if you're triggering the same problem we've tried to fix, 
or something else.


This program will create a new file, or overwrite an existing file. 
Don't pass it something you care about ;)


-chris

#define _FILE_OFFSET_BITS 64
#define _GNU_SOURCE

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>

#define ROUNDS 8
#define BUFSIZE (1024 * 1024)
#define MAXSIZE (128 * 1024ULL * 1024ULL * 1024ULL)

void *dontneed(void *arg)
{
	char *p = arg;
	int ret;

	while(1) {
		ret = madvise(p, BUFSIZE/4, MADV_DONTNEED);
		if (ret) {
			perror("madvise");
			exit(1);
		}
	}
}

int main(int ac, char **av) {
	int ret;
	int fd;
	char *filename;
	unsigned long offset;
	char *buf;
	int i;
	pthread_t tid;

	if (ac != 2) {
		fprintf(stderr, "usage: short-write filename\n");
		exit(1);
	}

	buf = mmap(NULL, BUFSIZE, PROT_READ|PROT_WRITE,
		   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		exit(1);
	}
	memset(buf, 'a', BUFSIZE);
	filename = av[1];

	ret = pthread_create(&tid, NULL, dontneed, buf);
	if (ret) {
		fprintf(stderr, "error %d from pthread_create\n", ret);
		exit(1);
	}

	ret = pthread_detach(tid);
	if (ret) {
		fprintf(stderr, "pthread detach failed %d\n", ret);
		exit(1);
	}

	while (1) {
		fd = open(filename, O_RDWR | O_CREAT, 0600);
		if (fd < 0) {
			perror("open");
			exit(1);
		}

		for (i = 0; i < ROUNDS; i++) {
			int this_write = BUFSIZE;

			offset = rand() % MAXSIZE;
			ret = pwrite(fd, buf, this_write, offset);
			if (ret < 0) {
				perror("pwrite");
				exit(1);
			} else if (ret != this_write) {
				fprintf(stderr, "short write to %s offset %lu ret %d\n",
					filename, offset, ret);
				exit(1);
			}
			if (i == ROUNDS - 1) {
				ret = sync_file_range(fd, offset, 4096,
						      SYNC_FILE_RANGE_WRITE);
				if (ret < 0) {
					perror("sync_file_range");
					exit(1);
				}
			}
		}
		ret = ftruncate(fd, 0);
		if (ret < 0) {
			perror("ftruncate");
			exit(1);
		}
		ret = close(fd);
		if (ret) {
			perror("close");
			exit(1);
		}
		ret = unlink(filename);
		if (ret) {
			perror("unlink");
			exit(1);
		}

	}
	return 0;
}


Re: [PATCH v2] Btrfs: fix eb memory leak due to readpage failure

2016-07-08 Thread David Sterba
On Fri, Jun 03, 2016 at 12:08:38PM -0700, Liu Bo wrote:
> eb->io_pages is set in read_extent_buffer_pages().
> 
> In case of readpage failure, for pages that have been added to bio,
> it calls bio_endio and later readpage_io_failed_hook() does the work.
> 
> When this eb's page (couldn't be the 1st page) fails to add itself to bio
> due to failure in merge_bio(), it cannot decrease eb->io_pages via bio_endio,
>  and ends up with a memory leak eventually.
> 
> This lets __do_readpage propagate errors to callers and adds the
>  'atomic_dec(&eb->io_pages)'.

I'm not sure, but could we lose some error values from __do_readpage?
Ie. return 0 even if there was an error in a page that's in the middle
(not the first, not the last).

The loop in __do_readpage iterates while (cur <= end), and ret is only
set by submit_extent_page, but the loop does not exit immediately. So
we can detect error, set page error state bit, but next loop will
overwrite ret with 0 (if the page submission was ok).

Then we still don't decrement the io_pages as needed.
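To make the concern concrete, here is a minimal userspace sketch of the two loop shapes. It does not use the real __do_readpage code: the array of per-page results and both function names are hypothetical. The first function mirrors a loop where ret is simply reassigned each iteration, the second keeps the first error seen.

```c
#include <stddef.h>

/* Hypothetical per-page submit results: 0 on success, negative on error. */
int submit_all_last_wins(const int *results, size_t n)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++)
		ret = results[i];	/* a later success overwrites an earlier error */
	return ret;
}

int submit_all_first_error(const int *results, size_t n)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int err = results[i];

		if (err && !ret)
			ret = err;	/* remember the first failure, keep iterating */
	}
	return ret;
}
```

For a sequence such as { 0, -5, 0 } the first variant returns 0 and the failure in the middle page is lost, while the second returns -5.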


Re: [PATCH] btrfs: fix WARNING in btrfs_select_ref_head()

2016-07-08 Thread David Sterba
On Mon, Jun 20, 2016 at 09:18:52AM +0800, Wang Xiaoguang wrote:
> @@ -2665,7 +2665,10 @@ static noinline int __btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
>  
>   btrfs_free_delayed_extent_op(extent_op);
>   if (ret) {
> + spin_lock(&delayed_refs->lock);
>   locked_ref->processing = 0;
> + delayed_refs->num_heads_ready++;
> + spin_unlock(&delayed_refs->lock);
>   btrfs_delayed_ref_unlock(locked_ref);
>   btrfs_put_delayed_ref(ref);
>   btrfs_debug(fs_info, "run_one_delayed_ref returned %d", ret);

I don't feel qualified to review this and add it to the 4.8 first pull.
The patch will be in for-next though.


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Gabriel C

On 08.07.2016 14:41, Chris Mason wrote:
> On 07/08/2016 05:57 AM, Gabriel C wrote:
>> 2016-07-07 21:21 GMT+02:00 Chris Mason :
>>> On 07/07/2016 06:24 AM, Gabriel C wrote:
>>>> Hi,
>>>>
>>>> while running thunderbird on linux 4.6.3 and 4.7.0-rc6 ( didn't tested
>>>> other versions )
>>>> I trigger the following :
>>>
>>> I definitely thought we had this fixed in v4.7-rc.  Can you easily
>>> fsck this filesystem?  Something strange is going on.
>>
>> Yes , btrfs check and btrfs check  --check-data-csum are fine , no
>> errors found.
>>
>> If you want me to test any patches let me know.
>
> Can you please try a v4.5 stable kernel?  I'm curious if this really
> is the same regression that I tried to fix in v4.7

I'm on linux 4.5.7 now and everything is fine. I'm writing this email
from thunderbird.. which was not
possible in 4.6.3 or 4.7.-rc.

Let me know you want me to test other kernels or whatever else may help
fixing this problem.


Regards,

Gabriel C



Re: [PATCH] btrfs-progs: add the error message when open fails

2016-07-08 Thread David Sterba
On Fri, Jun 10, 2016 at 08:32:12AM +0900, Tsutomu Itoh wrote:
> Hi, David,
> 
> On 2016/06/09 22:13, David Sterba wrote:
> > On Thu, Jun 09, 2016 at 10:23:15AM +0900, Tsutomu Itoh wrote:
> >> When open in btrfs_open_devices failed, only the following message is
> >> displayed. Therefore the user doesn't understand the reason why open
> >> failed.
> >>
> >>   # btrfs check /dev/sdb8
> >>   Couldn't open file system
> >>
> >> This patch adds the error message when open fails.
> >
> > I think the message should be printed by the caller, not by the helper.
> 
> However in this case, error device name is not understood in the caller.

Right, makes more sense to print from within btrfs_open_devices.
Applied.


Re: [PATCH RFC] don't background blkdev_put()

2016-07-08 Thread David Sterba
On Thu, Jun 23, 2016 at 09:01:53PM +0800, Anand Jain wrote:
> As of now we are calling blkdev_put() under call_rcu,
> actually which isn't necessary. Unless I am missing
> something. This is a try out patch to know the same.

Moving the blkdev_put after sync and invalidate, and out of the rcu
callback looks safe (and better) to me. Please create a helper for the
code that does

> + if (device->bdev && device->writeable) {
> + sync_blockdev(device->bdev);
> + invalidate_bdev(device->bdev);
> + }
> + if (device->bdev)
> + blkdev_put(device->bdev, device->mode);

It's repeated 3 times, justifies a helper.
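As a rough idea of what such a helper could look like, here is a userspace sketch with mock types so that it compiles on its own. The helper name close_bdev_demo and all the *_demo stand-ins are hypothetical. Only the order of operations (sync, invalidate, then put) is taken from the quoted hunk.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins so the sketch compiles; not the kernel definitions. */
struct bdev_demo { int synced; int invalidated; int put; };
struct device_demo {
	struct bdev_demo *bdev;
	bool writeable;
	int mode;
};

static void sync_blockdev_demo(struct bdev_demo *b) { b->synced++; }
static void invalidate_bdev_demo(struct bdev_demo *b) { b->invalidated++; }
static void blkdev_put_demo(struct bdev_demo *b, int mode) { (void)mode; b->put++; }

/* Hypothetical helper: collects the three repeated steps in one place. */
void close_bdev_demo(struct device_demo *device)
{
	if (!device->bdev)
		return;

	/* Only writeable devices need the sync and cache invalidation. */
	if (device->writeable) {
		sync_blockdev_demo(device->bdev);
		invalidate_bdev_demo(device->bdev);
	}
	blkdev_put_demo(device->bdev, device->mode);
}
```

Each of the three call sites would then shrink to a single call, and the ordering constraint lives in one place.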


Re: [PATCH] Btrfs: fix BUG_ON in btrfs_submit_compressed_write

2016-07-08 Thread David Sterba
On Thu, Jun 23, 2016 at 10:41:11AM -0700, Liu Bo wrote:
> On Thu, Jun 23, 2016 at 11:09:52AM +0200, David Sterba wrote:
> > On Wed, Jun 22, 2016 at 06:32:06PM -0700, Liu Bo wrote:
> > > This is similar to btrfs_submit_compressed_read(), if we fail after
> > > bio is allocated, then we can use bio_endio() and errors are saved
> > >  in bio->bi_error.  But please note that we don't return errors to
> > > its caller because the caller assumes it won't call endio to cleanup
> > > on error.
> > 
> > This sounds strange, where do we notice that some of the bios failed?
> 
> bio_endio()
>   -> end_compressed_bio_write()
>  -> end_compressed_writeback()
> -> mapping_set_error(inode->i_mapping, -EIO);

Thanks. We use the same logic in btrfs_submit_compressed_read as you
mention but I missed that first. Good riddance of the bug-ons.


Re: Out of space error even though there's 100 GB unused?

2016-07-08 Thread Henk Slager
On Fri, Jul 8, 2016 at 9:22 AM, Stanislaw Kaminski
 wrote:
> Huh.
>
> I left defrag running overnight, and now I'm back to my >200 GiB free
> space. Also, I got no "out of space" messages in Transmission, and it
> successfully downloaded few GBs.
>
> But in dmesg I have 209 traces, see attached.
>
> Does that say anything to you?

I get some vague clue about the problem, but no-one seems to know
exactly the root cause(s).
The 4.6.2 code where the warning comes from is this:
...
/*
 * Called if we need to clear a data reservation for this inode
 * Normally in a error case.
 *
 * This one will *NOT* use accurate qgroup reserved space API, just for case
 * which we can't sleep and is sure it won't affect qgroup reserved space.
 * Like clear_bit_hook().
 */
void btrfs_free_reserved_data_space_noquota(struct inode *inode, u64 start,
u64 len)
{
   struct btrfs_root *root = BTRFS_I(inode)->root;
   struct btrfs_space_info *data_sinfo;

   /* Make sure the range is aligned to sectorsize */
   len = round_up(start + len, root->sectorsize) -
   round_down(start, root->sectorsize);
   start = round_down(start, root->sectorsize);

   data_sinfo = root->fs_info->data_sinfo;
   spin_lock(&data_sinfo->lock);
   if (WARN_ON(data_sinfo->bytes_may_use < len))
   data_sinfo->bytes_may_use = 0;
   else
   data_sinfo->bytes_may_use -= len;
   trace_btrfs_space_reservation(root->fs_info, "space_info",
   data_sinfo->flags, len, 0);
   spin_unlock(&data_sinfo->lock);
}
...
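The two things that function does, widening the range to sector boundaries and clamping the counter at zero when it would underflow, can be tried out in userspace with the sketch below. The *_demo helpers are hypothetical re-implementations that assume a power-of-two sector size. They are not the kernel's round_up/round_down macros.

```c
#include <stdint.h>

/* Power-of-two rounding, similar in spirit to the kernel's round_down/round_up. */
static uint64_t round_down_demo(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static uint64_t round_up_demo(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Length of [start, start + len) after widening both ends to sector boundaries. */
uint64_t aligned_len_demo(uint64_t start, uint64_t len, uint64_t sectorsize)
{
	return round_up_demo(start + len, sectorsize) -
	       round_down_demo(start, sectorsize);
}

/*
 * Subtract len from the counter, clamping at zero. *underflow is set to 1
 * in the case where the kernel code would hit the WARN_ON.
 */
uint64_t release_demo(uint64_t bytes_may_use, uint64_t len, int *underflow)
{
	*underflow = bytes_may_use < len;
	return *underflow ? 0 : bytes_may_use - len;
}
```

Note that a 2-byte range straddling a 4096-byte boundary is accounted as two full sectors. The warning means more is being released than the counter says was reserved, so the accounting went wrong somewhere earlier.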

I think the system is still 'recovering' from getting stuck earlier.
What exactly that is, I don't know. You would probably need to enable
more debugging facilities in order to figure out which file or
inode the problem comes from. I don't know if you can compile a 4.7
kernel for this Kirkwood SoC, but that would be one way forward. (BTW,
is it a 88F6281 or a 88F6282?)

As far as I remember, Josef Bacik has posted some patches that could
benefit this case. I am not sure if they made it into 4.7, but that is
what I think I would try.

Otherwise, there are workarounds:
- you could have a look at the CPU load during defrag and normal
operation and see how it relates to the rate at which this warning is issued
- add mount option noatime
- as it looks like this fs is (also) a torrent-client target, you
can put the torrents in a directory or subvol with the noCoW flag set or
mount the whole fs with nodatacow
- again clean the cache and then mount with space_cache=v2. New
mounts will then use v2 automatically. The only thing I can say is that it
helped me get out of a kernel crash situation with 4.6.2. 4.7.0-rc5
did not crash on the same fs, so I got it working again (
de-allocations and cleanups, the fs is almost exclusively a
btrfs-receive target)
- connect the fs to a multi-core x86_64 system running the same kernel
version for some time and see if you can reproduce the same type of
WARN_ONs


Re: [PATCH 2/2] btrfs: fix free space calculation in dump_space_info()

2016-07-08 Thread David Sterba
On Wed, Jul 06, 2016 at 06:16:06PM +0800, Wang Xiaoguang wrote:
> hello,
> 
> On 07/05/2016 01:10 AM, David Sterba wrote:
> > On Wed, Jun 29, 2016 at 01:12:16PM +0800, Wang Xiaoguang wrote:
> >
> > Can you please describe in more detail what is this patch fixing?
> In original dump_space_info(), free space info is calculated by
> info->total_bytes - info->bytes_used - info->bytes_pinned - 
> info->bytes_reserved - info->bytes_readonly,
> but I think free space info should also minus info->bytes_may_use :)

Not really what I expected. The change is correct but the changelog
should say something like "the 'used space' formula is missing
bytes_may_use, which is used elsewhere, e.g. __reserve_metadata_bytes or
space_info_add_old_bytes". That way I have something to verify during
the review.
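Spelled out as a tiny userspace sketch, the formula with the extra term looks like the following. The struct is a hypothetical mirror of the counters named in this thread, not the kernel's btrfs_space_info.

```c
#include <stdint.h>

/* Hypothetical mirror of the counters named in the thread. */
struct space_info_demo {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint64_t bytes_pinned;
	uint64_t bytes_reserved;
	uint64_t bytes_readonly;
	uint64_t bytes_may_use;
};

/* Free space with bytes_may_use also subtracted, as the patch proposes. */
uint64_t free_bytes_demo(const struct space_info_demo *info)
{
	return info->total_bytes - info->bytes_used - info->bytes_pinned -
	       info->bytes_reserved - info->bytes_readonly -
	       info->bytes_may_use;
}
```

With total 1000, used 400, pinned 50, reserved 30, readonly 20 and may_use 100, this gives 400, whereas the formula without bytes_may_use would report 500.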


Re: [PATCH v3 2/2] btrfs: make sure device is synced before return

2016-07-08 Thread David Sterba
On Thu, Jun 23, 2016 at 08:54:07PM +0800, Anand Jain wrote:
> An inconsistent behavior due to stale reads from the
> disk was reported
> 
>   mail-archive.com/linux-btrfs@vger.kernel.org/msg54188.html
> 
> This patch will make sure devices are synced before
> return in the unmount thread.
> 
> Signed-off-by: Anand Jain 

Added to for-next.


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Gabriel C
2016-07-08 14:41 GMT+02:00 Chris Mason :
>
>
> On 07/08/2016 05:57 AM, Gabriel C wrote:
>>
>> 2016-07-07 21:21 GMT+02:00 Chris Mason :
>>>
>>>
>>>
>>> On 07/07/2016 06:24 AM, Gabriel C wrote:


 Hi,

 while running thunderbird on linux 4.6.3 and 4.7.0-rc6 ( didn't tested
 other versions )
 I trigger the following :
>>>
>>>
>>>
>>> I definitely thought we had this fixed in v4.7-rc.  Can you easily fsck
>>> this filesystem?  Something strange is going on.
>>
>>
>> Yes , btrfs check and btrfs check  --check-data-csum are fine , no errors
>> found.
>>
>> If you want me to test any patches let me know.
>>
>
> Can you please try a v4.5 stable kernel?  I'm curious if this really is the
> same regression that I tried to fix in v4.7
>

Sure , I'll test on 4.5.7 and let you know.


Re: errors with linux-next-20160701

2016-07-08 Thread Chris Mason



On 07/08/2016 09:51 AM, David Sterba wrote:
> On Thu, Jul 07, 2016 at 08:15:18PM +0200, Laszlo Fiat wrote:
>> I have a simple btrfs filesystem on a single device. It worked well so far.
>>
>> Recently I compiled a new kernel linux-next-20160701, with this new
>> kernel I get warnings and errors in the logs.
>
> I hope you're aware that using linux-next can lead to all sorts of
> problems.

This should be easy to verify by pulling Dave's for-next branch on top 
of Linus' current git.  Either the bugs will disappear or not.


-chris


Re: errors with linux-next-20160701

2016-07-08 Thread David Sterba
On Thu, Jul 07, 2016 at 08:15:18PM +0200, Laszlo Fiat wrote:
> I have a simple btrfs filesystem on a single device. It worked well so far.
> 
> Recently I compiled a new kernel linux-next-20160701, with this new
> kernel I get warnings and errors in the logs.

I hope you're aware that using linux-next can lead to all sorts of
problems.

> But btrfs scrub
> completes with 0 errors, and if I boot back to the older
> linux-next-20160527 kernel, there are no error messages in the logs
> when using it or running scrub.

There's another party involved, device mapper, so this is another factor to
be taken into account when you compare the kernels.

> The filesystem is mountable
> read-write, but I am worried about the warnings and errors. The
> checksum warnings always come up with new numbers, never the same.

This looks like random memory overwrites, but that's just a guess.

> $ uname -a
> Linux debian 4.7.0-rc5-next-20160701+ #46 SMP Sun Jul 3 15:29:10 CEST
> 2016 x86_64 GNU/Linux
> 
> $ btrfs --version
> btrfs-progs v4.5.2
> 
> # btrfs fi show
> Label: none  uuid: d6cab9ca-5e89-4d8c-b55b-c700a6096d37
> Total devices 1 FS bytes used 55.31GiB
> devid    1 size 119.23GiB used 62.07GiB path /dev/mapper/home
> 
> # btrfs fi df /home
> Data, single: total=59.01GiB, used=54.86GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=1.50GiB, used=462.56MiB
> GlobalReserve, single: total=88.42MiB, used=0.00B
> 
> # dmesg | grep Btrfs
> [4.530960] Btrfs loaded, crc32c=crc32c-intel
> 
> # dmesg | grep BTRFS
> [4.530968] BTRFS: selftest: sectorsize: 4096  nodesize: 4096
> [4.530973] BTRFS: selftest: sectorsize: 4096  nodesize: 8192
> [4.530978] BTRFS: selftest: sectorsize: 4096  nodesize: 16384
> [4.530982] BTRFS: selftest: sectorsize: 4096  nodesize: 32768
> [4.530986] BTRFS: selftest: sectorsize: 4096  nodesize: 65536
> [   52.114886] BTRFS: device fsid d6cab9ca-5e89-4d8c-b55b-c700a6096d37
> devid 1 transid 29825 /dev/dm-0
> [   60.598254] BTRFS info (device dm-0): use lzo compression
> [   60.598266] BTRFS info (device dm-0): disk space caching is enabled
> [   60.598273] BTRFS info (device dm-0): has skinny extents
> [   60.797475] BTRFS warning (device dm-0): dm-0 checksum verify
> failed on 343146496 wanted D962670F found 32292342 level 0
> [   60.850008] BTRFS info (device dm-0): detected SSD devices, enabling SSD 
> mode
> [  165.491150] BTRFS error (device dm-0): bad tree block start
> 8242807833012638730 658522112

A quick sanity check of the values:

8242807833012638730 == 0x7264560d4686940a

which does not look like a valid block pointer (as it's supposed to be
aligned to 4k, ie. ending with 000). This looks like the block has been
overwritten externally.
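
A minimal sketch of that sanity check, in Python. The function name is made up for illustration, the 4096-byte alignment comes from the "4k" remark above, and treating the second number on the log line as the address the block was expected to carry is an assumption read off the message format.

```python
def is_plausible_block_start(value: int, alignment: int = 4096) -> bool:
    """Return True if value is a multiple of alignment (a power of two)."""
    return value & (alignment - 1) == 0


# Numbers copied from the "bad tree block start" line in the dmesg output.
reported = 8242807833012638730  # value found in the block header
expected = 658522112            # assumed: the address the block was read from

for name, value in (("reported", reported), ("expected", expected)):
    verdict = "aligned" if is_plausible_block_start(value) else "NOT aligned"
    print(f"{name}: {value} = {hex(value)} -> {verdict} to 4k")
```

An aligned value has its low 12 bits clear, which is why the hex form should end in 000.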

> # grep "BTRFS error" /var/log/syslog.1
> Jul  6 19:35:23 debian kernel: [ 2712.823929] BTRFS error (device
> dm-0): bad tree block start 10283429131165574676 662503424
> Jul  6 19:35:23 debian kernel: [ 2712.850468] BTRFS error (device
> dm-0): bad tree block start 14801127411347629381 663502848
> Jul  6 19:35:23 debian kernel: [ 2712.888038] BTRFS error (device
> dm-0): bad tree block start 9855282569545798023 664141824
> Jul  6 19:35:23 debian kernel: [ 2712.888491] BTRFS error (device
> dm-0): bad tree block start 12220505751590977444 664207360
> Jul  6 19:35:59 debian kernel: [ 2748.728044] BTRFS error (device
> dm-0): bad tree block start 16324583772582058537 665927680
> Jul  6 19:37:31 debian kernel: [ 2840.465648] BTRFS error (device
> dm-0): bad tree block start 13618790082605902229 309936128
> Jul  6 19:37:31 debian kernel: [ 2840.465672] BTRFS error (device
> dm-0): bad tree block start 9260130888975445835 309870592
> Jul  6 19:37:31 debian kernel: [ 2840.466526] BTRFS error (device
> dm-0): bad tree block start 17351834078110360434 309968896
> Jul  6 19:37:31 debian kernel: [ 2840.466579] BTRFS error (device
> dm-0): bad tree block start 3476538019772833052 309985280
> Jul  6 19:37:31 debian kernel: [ 2840.509021] BTRFS error (device
> dm-0): bad tree block start 1881224518785478735 327696384
> Jul  6 19:37:31 debian kernel: [ 2840.509085] BTRFS error (device
> dm-0): bad tree block start 14212257183956925500 327712768
> Jul  6 19:37:31 debian kernel: [ 2840.533393] BTRFS error (device
> dm-0): bad tree block start 13574459615317154064 331268096
> Jul  6 19:37:53 debian kernel: [ 2862.852848] BTRFS error (device
> dm-0): bad tree block start 13618790082605902229 309936128
> Jul  7 19:52:05 debian kernel: [  165.491150] BTRFS error (device
> dm-0): bad tree block start 8242807833012638730 658522112

The other 'start' values are also bogus block pointers.

I don't see an apparent cause of the errors, but this kind of report
usually points to external factors, so I don't think it's a btrfs bug.


Re: kdave/for-next commit 26112f7f472

2016-07-08 Thread David Sterba
On Fri, Jul 08, 2016 at 07:55:47AM -0400, Jeff Mahoney wrote:
> >> The problem is that btrfs_calc_reclaim_metadata_size didn't used to be
> >> called from recovery, so it was safe to use fs_info->fs_root.  With
> >> commit 7c83c6a09 (Btrfs: don't bother kicking async if there's nothing
> >> to reclaim) we do call it from recovery context and fs_info->fs_root is
> >> NULL.
> >>
> >> The fix is to just not switch btrfs_calc_reclaim_metadata_size to take
> >> an fs_info.  All the other call sites were using fs_info->fs_root
> >> anyway, so it's not like we're pinning a root somewhere just for this call.
> > 
> > I've had this patch from last October in my 4.4.x tree forever:
> > http://www.spinics.net/lists/linux-btrfs/msg48457.html
> > 
> > Apparently it fell off the table. Shouldn't that fix it?
> 
> A different fix went into for-next.

Which JFI is https://patchwork.kernel.org/patch/8928981/ so the fix from
Tsutomu Itoh is not relevant anymore (but yeah it was lost in the
noise).


Re: [PATCH 00/31] btrfs: simplify use of struct btrfs_root pointers

2016-07-08 Thread David Sterba
On Thu, Jul 07, 2016 at 10:19:37PM -0400, Jeff Mahoney wrote:
> On 7/7/16 9:48 PM, Jeff Mahoney wrote:
> > On 6/24/16 6:14 PM, je...@suse.com wrote:
> >> From: Jeff Mahoney 
> >>
> >> One of the common complaints I've heard from new and experienced
> >> developers alike about the btrfs code is the ubiquity of
> >> struct btrfs_root.  There is one for every tree on disk and it's not
> >> always obvious which root is needed in a particular call path.  It can
> >> be frustrating to spend time figuring out which root is required only
> >> to discover that it's not actually used for anything other than
> >> getting the fs-global struct btrfs_fs_info.
> >>
> >> The patchset contains several sections.
> >>
> >> 1) The fsid trace event patchset I posted earlier; I can rebase without
> >>    this but I'd prefer not to.
> >>
> >> 2) Converting btrfs_test_opt and friends to use an fs_info.
> >>
> >> 3) Converting tests to use an fs_info pointer whenever a root is used.
> >>
> >> 4) Moving sectorsize and nodesize to fs_info and cleaning up the
> >>    macros used to access them.
> > 
> > This change was a little overzealous in free-space-cache.c, which hit
> > block_group->sectorsize as well as root->sectorsize by accident.  While
> > the change is fine for general btrfs usage, it breaks the sanity tests
> > since dummy block groups now depend on a dummy fs_info as well.
> 
> There's also another error in btrfs_alloc_dummy_fs_info that doesn't
> initialize sectorsize.
> 
> Clearly my test config got CONFIG_BTRFS_FS_RUN_SANITY_TESTS disabled at
> some point. :(

I've tried to fix that but the tests still failed, so for the sake of
getting for-next out, I've verified that commits up to

"btrfs: btrfs_abort_transaction, drop root parameter"

do boot so please send updates only to the following patches. My fixup
attempts are in the branch for-4.8-fixups-buggy.


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Chris Mason



On 07/08/2016 05:57 AM, Gabriel C wrote:
> 2016-07-07 21:21 GMT+02:00 Chris Mason :
>> On 07/07/2016 06:24 AM, Gabriel C wrote:
>>> Hi,
>>>
>>> while running thunderbird on linux 4.6.3 and 4.7.0-rc6 ( didn't test
>>> other versions )
>>> I trigger the following :
>>
>> I definitely thought we had this fixed in v4.7-rc.  Can you easily fsck this
>> filesystem?  Something strange is going on.
>
> Yes, btrfs check and btrfs check --check-data-csum are fine, no errors found.
>
> If you want me to test any patches let me know.

Can you please try a v4.5 stable kernel?  I'm curious if this really is
the same regression that I tried to fix in v4.7


-chris



Re: kdave/for-next commit 26112f7f472

2016-07-08 Thread Holger Hoffstätte
On 07/08/16 13:55, Jeff Mahoney wrote:
> On 7/8/16 7:19 AM, Holger Hoffstätte wrote:
>> On 07/08/16 06:24, Jeff Mahoney wrote:
>>> Hi Dave -
>>>
>>> This commit introduces a bug.  I ran across it when running xfstests
>>> against my own integrated branch.
>>
>> I can't find that commit id anywhere...?
> 
> Hi Holger -
> 
> This is the for-next branch.  It's not in any mainline branch yet.

Yes, I understand that. I searched in the github/kdave tree, which only
has the for-next-xyz branches. Found it in the one on kernel.org.

-h






Re: 64-btrfs.rules and degraded boot

2016-07-08 Thread Austin S. Hemmelgarn

On 2016-07-07 16:20, Chris Murphy wrote:
> On Thu, Jul 7, 2016 at 1:59 PM, Austin S. Hemmelgarn
> wrote:
>> D-Bus support needs to be optional, period.  Not everybody uses D-Bus (I
>> have dozens of systems that get by just fine without it, and know hundreds
>> of other people who do as well), and even people who do don't always use
>> every tool needed (on the one system I manage that does have it, the only
>> things I need it for are Avahi, ConsoleKit, udev, and NetworkManager, and
>> I'm getting pretty close to the point of getting rid of NM and CK and
>> re-implementing or forking Avahi).  You have to consider the fact that there
>> are and always will be people who do not install a GUI on their system and
>> want the absolute minimum of software installed.
>
> That's fine, they can monitor kernel messages directly as their
> notification system. I'm concerned with people who don't ever look at
> kernel messages, you know, mortal users who have better things to do
> with a computer than that. It's important for most anyone to not have
> to wait for problems to manifest traumatically.

My point is that they probably need btrfs-progs too.  Take me for
example: I don't use some fancy graphical tool to tell me when my disks
are failing, but I don't scrape kernel logs either.  I have things set
up to monitor the disks directly (using btrfs-progs for the things it
can check for), and notify me via e-mail if there's an issue.  Not
supporting that use case at all would be like e2fsprogs adding a
dependency on X11 and telling everyone who doesn't want to use X11 to
just go implement their own tools.  If that happened, e2fsprogs would
get forked, the commit reverted in that fork, and most of the
non-enterprise distros would probably switch pretty damn quick to the
forked version.




>> Personally, I don't care what Fedora is doing, or even what GNOME (or any
>> other DE for that matter, the only reason I use Xfce is because some things
>> need a GUI (many of them unnecessarily), and that happens to be the DE I
>> have the fewest complaints about) is doing.  The only reason that things
>> like GNOME Disks and such exist is because they're trying to imitate Windows
>> and OS X, which is all well and good for a desktop, but is absolute crap for
>> many server and embedded environments (Microsoft finally realized this, and
>> Windows Server 2012 added the ability to install without a full desktop,
>> which actually means that they have _more_ options than a number of Linux
>> distributions (yes you can rip out the desktop on many distros if you want,
>> but that takes an insane amount of effort most of the time, not to mention
>> storage space)).
>
> I'm willing to bet dollars to donuts Xfce fans would love to know if
> one of their rootfs mirrors is spewing read errors, while smartd
> defers to the drive which says "hey no problems here". GNOME at least
> does report certain critical smart errors, but that still leaves
> something like 40% of drive failures happening without prior notice.

I'm not saying some specific users don't care, I'm saying that requiring
people to have a specific software stack which may not work for their
use case is a stupid choice for something as low level as this.  Yes,
people want to know when something has failed, but we shouldn't mandate
_how_ they check for it on a given system.  There need to be
more choices than just a GUI tool and talking directly to the kernel.
Looking at this another way, it is fully possible to implement something
to do this in a DE agnostic manner _without depending on D-Bus_ using
the tools as they are right now.  An initial implementation would of
course be inefficient, but until we get notifications _from the kernel_
about FS state, we have to poll regardless, which means that having
D-Bus support would not help (and would probably just make things slower).
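
A rough illustration of the polling approach described above, in Python. It assumes the per-device counter lines printed by `btrfs device stats` look like `[/dev/sda].read_io_errs 0`; the function name, the sample text, and the device names are made up for the example, and the step that actually sends mail is left out.

```python
import re
from typing import List, Tuple

# Assumed line format of `btrfs device stats <mountpoint>`:
#   [/dev/sda].write_io_errs   0
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def nonzero_counters(stats_output: str) -> List[Tuple[str, str, int]]:
    """Return (device, counter, value) for every counter that is not zero."""
    problems = []
    for line in stats_output.splitlines():
        match = STAT_LINE.match(line.strip())
        if match and int(match["value"]) > 0:
            problems.append((match["dev"], match["counter"], int(match["value"])))
    return problems


# Hypothetical sample output, not taken from a real system.
sample = """\
[/dev/sda].write_io_errs   0
[/dev/sda].read_io_errs    3
[/dev/sda].flush_io_errs   0
[/dev/sdb].corruption_errs 1
[/dev/sdb].generation_errs 0
"""

for dev, counter, value in nonzero_counters(sample):
    print(f"{dev}: {counter} = {value}")
```

A periodic job could feed the real command output into `nonzero_counters()` and mail the result when the list is not empty; nothing in it needs D-Bus or a desktop environment.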




>> Storaged also qualifies as something that _needs_ to be optional, especially
>> because it appears to require systemd (and it falls into the same category
>> as D-Bus of 'unnecessary bloat on many systems').  Adding a mandatory
>> dependency on systemd _will_ split the community and severely piss off quite
>> a few people (you will likely get some rather nasty looks from a number of
>> senior kernel developers if you meet them in person).
>
> I just want things to work for users, defined as people who would like
> to stop depending on Windows and macOS for both server and desktop
> usage. I don't really care about ideological issues outside of that
> goal.

Making us hard depend on storaged would not help this goal.  It's no
different than the Microsoft and Apple approach of 'our way or not at all'.


To clarify, I'm not trying to argue against adding support, I'm arguing
against it being mandatory.  A filesystem which requires specific system
services to be running just for regular maintenance tasks is not a well
designed filesystem.  To be entirely honest, I'm not all that happy
about the functional dependency on udev to have device discovery, 

Re: raid1 has failing disks, but smart is clear

2016-07-08 Thread Austin S. Hemmelgarn

On 2016-07-08 07:14, Tomasz Kusmierz wrote:
>> Well, I was able to run memtest on the system last night, that passed with
>> flying colors, so I'm now leaning toward the problem being in the sas card.
>> But I'll have to run some more tests.
>
> Seriously, use "stres.sh" for a couple of days. When I was running
> memtest, it ran continuously for 3 days without an error; after a day
> of stres.sh, errors started showing up.
> Be VERY careful about trusting any tool of that sort, modern CPUs lie
> to you continuously !!!
> 1. You may think that you've written the best code on the planet that
> bypasses the CPU cache, but in reality, since CPUs are multicore, you can
> end up with overzealous MPMD trapping you inside of your cache memory,
> and all your testing will do is write a page (trapped in cache) and read it
> from cache (coherency mechanism, not the miss/hit one), which will trap you
> inside of L3 so you have no clue you don't touch the RAM; then the CPU
> will just dump your page to RAM and "job done".
> 2. Due to coherency problems and real problems with non-blocking on
> MPMD, you can have a DMA controller sucking pages out of your own cache:
> because the RAM is marked as dirty, the CPU will try to save time
> and accelerate the operation by pushing DMA straight out of L3 to
> somewhere else (mentioning that since some testers use a crazy way of
> forcing your RAM access via DMA to somewhere and back to force dropping
> out of L3).
> 3. This one is actually funny: some testers didn't claim the pages for
> the process, so for some reason the pages that they were using were not
> showing up as used / dirty etc., so all the testing was done in 32kB of L1
> ... tests were fast though :)
>
> stres.sh will test operation of the whole system !!! It shifts a lot
> of data so the disks are engaged, the CPU keeps pumping out CRC32 all the
> time so it's busy, and RAM gets hit nicely as well due to high DMA.

Agreed, never just trust memtest86 or memtest86+.

FWIW, here's the routine I go through to test new RAM:
1. Run regular memtest86 for at least 3 full cycles in full SMP mode (F2 
while starting up to force SMP).  On some systems this may hang, but 
that's an issue in the BIOS's setup of the CPU and MC, not the RAM, and 
is generally not indicative of a system which will have issues.
2. Run regular memtest86 for at least 3 full cycles in regular UP mode 
(the default on most non-NUMA hardware).
3. Repeat 1 and 2 with memtest86+.  It's diverged enough from regular 
memtest86 that it's functionally a separate tool, and I've seen RAM that 
passes one but not the other on multiple occasions before.
4. Boot SystemRescueCD, download a copy of the Linux sources, and run as 
many allmodconfig builds in parallel as I have CPUs, each with a number 
of make jobs equal to twice the number of CPUs (so each CPU ends up 
running at least two threads).  This forces enough context switching to 
completely trash even the L3 cache on almost any modern processor, which 
means it forces things out to RAM.  It won't hit all your RAM, but I've 
found it to be a relatively reliable way to verify the memory bus and 
the memory controller work properly.
5. Still from SystemRescueCD, use a tool called memtester (essentially 
memtest86, but run from userspace) to check the RAM.
6. Still from SystemRescueCD, use sha1sum to compute SHA-1 hashes of all 
the disks in the system, using at least 8 instances of sha1sum per CPU 
core, and make sure that all the sums for a disk match.
7. Do 6 again, but using cat to compute the sum of a concatenation of 
all the disks in the system (so the individual commands end up being 
`cat /dev/sd? | sha1sum`).  This will rapidly use all available memory 
on the system and keep it in use for quite a while.
8. If I'm using my home server system, I also have a special virtual 
runlevel set up where I spin up 4 times as many VM's as I have CPU cores 
(so on my current 8 core system, I spin up 32), all assigned a part of 
the RAM not used by the host (which I shrink to the minimum useable size 
of about 500MB), all running steps 1-3 in parallel.
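
The idea behind step 6 (hash the same data several times in parallel and check that every run agrees) can be sketched in Python. This is an illustration only: it uses threads and `hashlib` instead of separate `sha1sum` processes, it hashes a temporary file rather than a real disk, and the function names and the instance count of 8 are placeholders.

```python
import hashlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def sha1_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashes_agree(path: str, instances: int = 8) -> bool:
    """Hash the same file `instances` times concurrently and compare results."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        digests = list(pool.map(sha1_of, [path] * instances))
    return len(set(digests)) == 1


# Demo on a throwaway file; on a real system the path would be a block device.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(1 << 20))
try:
    print("all digests match:", hashes_agree(tmp.name))
finally:
    os.unlink(tmp.name)
```

A mismatch between runs over identical data means something between the disk and the CPU corrupted it in flight, which is the point of the test.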


It may also be worth mentioning that I've seen very poorly behaved HBA's 
that produce symptoms that look like bad RAM, including issues not 
related to the disks themselves, yet show no issues when regular memory 
testing is run.


> Come to think of it, if your device points change during
> operation of the system, it might be an LSI card dying -> reinitializing
> -> rediscovering drives -> drives show up at a different point. On my
> system I can hot swap SATA and it will come up with a different dev even
> though it was connected to the same place on the controller.

Barring a few odd controllers I've seen which support hot-plug but not 
hot-remove, that shouldn't happen unless the device is in use, and in 
that case it only happens because of the existing open references to the 
device being held by whatever is using it.

> I think most important - I presume you run non-ECC?

And if not, how well shielded is your system?  You can often get by with 
non-ECC RAM if you have good EMI 

Re: kdave/for-next commit 26112f7f472

2016-07-08 Thread Jeff Mahoney
On 7/8/16 7:19 AM, Holger Hoffstätte wrote:
> On 07/08/16 06:24, Jeff Mahoney wrote:
>> Hi Dave -
>>
>> This commit introduces a bug.  I ran across it when running xfstests
>> against my own integrated branch.
> 
> I can't find that commit id anywhere...?

Hi Holger -

This is the for-next branch.  It's not in any mainline branch yet.

>> The problem is that btrfs_calc_reclaim_metadata_size didn't used to be
>> called from recovery, so it was safe to use fs_info->fs_root.  With
>> commit 7c83c6a09 (Btrfs: don't bother kicking async if there's nothing
>> to reclaim) we do call it from recovery context and fs_info->fs_root is
>> NULL.
>>
>> The fix is to just not switch btrfs_calc_reclaim_metadata_size to take
>> an fs_info.  All the other call sites were using fs_info->fs_root
>> anyway, so it's not like we're pinning a root somewhere just for this call.
> 
> I've had this patch from last October in my 4.4.x tree forever:
> http://www.spinics.net/lists/linux-btrfs/msg48457.html
> 
> Apparently it fell off the table. Shouldn't that fix it?

A different fix went into for-next.  That's where the conflict is.  The
merged version of my root->fs_info patch reverts it.

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: kdave/for-next commit 26112f7f472

2016-07-08 Thread Holger Hoffstätte
On 07/08/16 06:24, Jeff Mahoney wrote:
> Hi Dave -
> 
> This commit introduces a bug.  I ran across it when running xfstests
> against my own integrated branch.

I can't find that commit id anywhere...?

> The problem is that btrfs_calc_reclaim_metadata_size didn't used to be
> called from recovery, so it was safe to use fs_info->fs_root.  With
> commit 7c83c6a09 (Btrfs: don't bother kicking async if there's nothing
> to reclaim) we do call it from recovery context and fs_info->fs_root is
> NULL.
> 
> The fix is to just not switch btrfs_calc_reclaim_metadata_size to take
> an fs_info.  All the other call sites were using fs_info->fs_root
> anyway, so it's not like we're pinning a root somewhere just for this call.

I've had this patch from last October in my 4.4.x tree forever:
http://www.spinics.net/lists/linux-btrfs/msg48457.html

Apparently it fell off the table. Shouldn't that fix it?

-h






Re: [PATCH 05/31] btrfs: tests, require fs_info for root

2016-07-08 Thread David Sterba
On Thu, Jul 07, 2016 at 09:32:37PM -0400, Jeff Mahoney wrote:
> On 6/24/16 6:14 PM, je...@suse.com wrote:
> > From: Jeff Mahoney 
> > 
> > This allows the upcoming patchset to push nodesize and sectorsize into
> > fs_info.
> > 
> > Signed-off-by: Jeff Mahoney 
> > ---
> >  fs/btrfs/ctree.h   |  1 +
> >  fs/btrfs/disk-io.c | 15 +++
> >  fs/btrfs/disk-io.h |  3 ++-
> >  fs/btrfs/tests/btrfs-tests.c   | 20 ---
> >  fs/btrfs/tests/btrfs-tests.h   |  1 +
> >  fs/btrfs/tests/extent-buffer-tests.c   | 23 +++--
> >  fs/btrfs/tests/free-space-tests.c  | 14 +++
> >  fs/btrfs/tests/free-space-tree-tests.c | 18 +++--
> >  fs/btrfs/tests/inode-tests.c   | 46 ++
> >  fs/btrfs/tests/qgroup-tests.c  | 23 +
> >  10 files changed, 103 insertions(+), 61 deletions(-)
> > 
> > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> > index 100d2ea..4781057 100644
> > --- a/fs/btrfs/ctree.h
> > +++ b/fs/btrfs/ctree.h
> > @@ -117,6 +117,7 @@ static inline unsigned long btrfs_chunk_item_size(int num_stripes)
> >  #define BTRFS_FS_STATE_REMOUNTING  1
> >  #define BTRFS_FS_STATE_TRANS_ABORTED   2
> >  #define BTRFS_FS_STATE_DEV_REPLACING   3
> > +#define BTRFS_FS_STATE_DUMMY_FS_INFO   4
> >  
> >  #define BTRFS_BACKREF_REV_MAX  256
> >  #define BTRFS_BACKREF_REV_SHIFT56
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index 8f27127..418163d 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -1233,6 +1233,7 @@ static void __setup_root(u32 nodesize, u32 sectorsize, u32 stripesize,
> >  struct btrfs_root *root, struct btrfs_fs_info *fs_info,
> >  u64 objectid)
> >  {
> > +   bool dummy = test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state);
> > root->node = NULL;
> > root->commit_root = NULL;
> > root->sectorsize = sectorsize;
> > @@ -1287,14 +1288,14 @@ static void __setup_root(u32 nodesize, u32 sectorsize, u32 stripesize,
> > root->log_transid = 0;
> > root->log_transid_committed = -1;
> > root->last_log_commit = 0;
> > -   if (fs_info)
> > +   if (dummy)
> 
> This should be:
> if (!dummy)
> 
> > extent_io_tree_init(&root->dirty_log_pages,
> >  fs_info->btree_inode->i_mapping);
> >  
> > memset(&root->root_key, 0, sizeof(root->root_key));
> > memset(&root->root_item, 0, sizeof(root->root_item));
> > memset(&root->defrag_progress, 0, sizeof(root->defrag_progress));
> > -   if (fs_info)
> > +   if (dummy)
> 
> So should this.

Updated in the patch.


Re: raid1 has failing disks, but smart is clear

2016-07-08 Thread Tomasz Kusmierz
>
> Well, I was able to run memtest on the system last night, that passed with
> flying colors, so I'm now leaning toward the problem being in the sas card.
> But I'll have to run some more tests.
>

Seriously, use "stres.sh" for a couple of days. When I was running
memtest, it ran continuously for 3 days without an error; after a day
of stres.sh, errors started showing up.
Be VERY careful about trusting any tool of that sort, modern CPUs lie
to you continuously !!!
1. You may think that you've written the best code on the planet that
bypasses the CPU cache, but in reality, since CPUs are multicore, you can
end up with overzealous MPMD trapping you inside of your cache memory,
and all your testing will do is write a page (trapped in cache) and read it
from cache (coherency mechanism, not the miss/hit one), which will trap you
inside of L3 so you have no clue you don't touch the RAM; then the CPU
will just dump your page to RAM and "job done".
2. Due to coherency problems and real problems with non-blocking on
MPMD, you can have a DMA controller sucking pages out of your own cache:
because the RAM is marked as dirty, the CPU will try to save time
and accelerate the operation by pushing DMA straight out of L3 to
somewhere else (mentioning that since some testers use a crazy way of
forcing your RAM access via DMA to somewhere and back to force dropping
out of L3).
3. This one is actually funny: some testers didn't claim the pages for
the process, so for some reason the pages that they were using were not
showing up as used / dirty etc., so all the testing was done in 32kB of L1
... tests were fast though :)

stres.sh will test operation of the whole system !!! It shifts a lot
of data so the disks are engaged, the CPU keeps pumping out CRC32 all the
time so it's busy, and RAM gets hit nicely as well due to high DMA.

Come to think of it, if your device points change during
operation of the system, it might be an LSI card dying -> reinitializing
-> rediscovering drives -> drives show up at a different point. On my
system I can hot swap SATA and it will come up with a different dev even
though it was connected to the same place on the controller.

I think most important - I presume you run non-ECC?


Re: [PATCH] btrfs-progs: tests: 006-image-on-missing-device: fix btrfs tool path

2016-07-08 Thread David Sterba
On Mon, Jul 04, 2016 at 11:48:47PM +0100, Luis Henriques wrote:
> If btrfs isn't in the path, this test will fail with:
> 
> [TEST/misc]   006-image-on-missing-device
> failed: btrfs fi show /dev/loop0
> test failed for case 006-image-on-missing-device
> Makefile:226: recipe for target 'test-misc' failed
> make: *** [test-misc] Error 1
> 
> Fix the test script by adding $TOP to the path.
> 
> Signed-off-by: Luis Henriques 

Applied, thanks.


Re: A lot warnings in dmesg while running thunderbird

2016-07-08 Thread Gabriel C
2016-07-07 21:21 GMT+02:00 Chris Mason :
>
>
> On 07/07/2016 06:24 AM, Gabriel C wrote:
>>
>> Hi,
>>
>> while running thunderbird on linux 4.6.3 and 4.7.0-rc6 ( didn't test
>> other versions )
>> I trigger the following :
>
>
> I definitely thought we had this fixed in v4.7-rc.  Can you easily fsck this 
> filesystem?  Something strange is going on.

Yes, btrfs check and btrfs check --check-data-csum are fine, no errors found.

If you want me to test any patches let me know.


Regards,

Gabriel C


[PATCH] btrfs-progs: dedupe-inband enable/reconfigure: force option does not take argument

2016-07-08 Thread Satoru Takeuchi
---
This patch can be applied to integration-20160704(2355a7e5dcdf122d1924)
---
 cmds-dedupe-ib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-dedupe-ib.c b/cmds-dedupe-ib.c
index 342784c..dbb30ab 100644
--- a/cmds-dedupe-ib.c
+++ b/cmds-dedupe-ib.c
@@ -132,7 +132,7 @@ static int enable_reconfig_dedupe(int argc, char **argv, int reconf)
    { "hash-algorithm", required_argument, NULL, 'a'},
    { "limit-hash", required_argument, NULL, 'l'},
    { "limit-memory", required_argument, NULL, 'm'},
-   { "force", required_argument, NULL, 'f'},
+   { "force", no_argument, NULL, 'f'},
    { NULL, 0, NULL, 0}
    };

-- 
2.5.5
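
The effect of declaring a flag as taking an argument can be reproduced with a small Python sketch. Note that this uses Python's `getopt` module, not the C `getopt_long()` call the patch touches; a trailing `=` in the long option name plays the role of `required_argument`, and its absence plays the role of `no_argument`. The `parse()` helper and the `/mnt` path are placeholders for the example.

```python
import getopt


def parse(argv, force_spec):
    """Parse argv with a long-option table that declares --force as force_spec."""
    long_options = ["limit-memory=", force_spec]
    return getopt.getopt(argv, "", long_options)


command_line = ["--force", "/mnt"]

# Declared as requiring an argument: --force swallows the path.
print(parse(command_line, "force="))

# Declared as a plain flag: the path is left over as a positional argument.
print(parse(command_line, "force"))
```

With the first declaration the path disappears into the option value, so the command never sees the filesystem it was asked to operate on.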