Re: BUG: scheduling while atomic

2018-08-10 Thread James Courtier-Dutton
On 6 August 2018 at 07:26, Qu Wenruo  wrote:
>
>
>> WARNING: CPU: 3 PID: 803 at
>> /build/linux-hwe-SYRsgd/linux-hwe-4.15.0/fs/btrfs/extent_map.c:77
>> free_extent_map+0x78/0x90 [btrfs]
>
> Then it makes sense, as it's a WARN_ON() line, showing one extent map is
> still used.
>
> If it gets freed, it will definitely cause some rbtree corruption.
>
>
> It should be the only free_extent_map() call in the __do_readpage() function.
> However, a first glance at this function shows nothing wrong, and no recent
> modifications to it.
> (Maybe it's the get_extent() hook?)
>
> Is there any reproducer? Or special workload?
The workload is fairly simple.
1) The server is receiving 1Gbyte files from across the network, in 10
minute intervals, and storing them on the HDD.
2) A process reads the files, scanning them for certain patterns.
3) A cron job deletes the old files.



>
> And, have you tried "btrfs check --readonly "? If there is any
> error it would help a lot to locate the problem.
>
root@sandisk:~# btrfs check --readonly /dev/sda3
Checking filesystem on /dev/sda3
UUID: 8c9063b9-a4bb-48d1-92ba-6adf49af6fb5
checking extents
checking free space cache
block group 6874953940992 has wrong amount of free space
failed to load free space cache for block group 6874953940992
checking fs roots
checking csums
checking root refs
found 1488143566396 bytes used err is 0
total csum bytes: 1448276084
total tree bytes: 5079711744
total fs tree bytes: 3280207872
total extent tree bytes: 146997248
btree space waste bytes: 1100557047
file data blocks allocated: 2266345996288
 referenced 2235075653632
root@sandisk:~#


So, not much to see there.
Any more ideas?


BUG: scheduling while atomic

2018-08-05 Thread James Courtier-Dutton
I am seeing a server halt and require a manual restart that I think
might be related to btrfs.
I attach the kernel log from it, in the hope that someone will
understand it better than me.
Any clues?

https://paste.fedoraproject.org/paste/xSblK1RKANiwhKHQj31Cdw


Kind Regards

James
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


How to zero errors

2018-04-10 Thread James Courtier-Dutton
Hi,

I have a disk that had errors on it in the past.
I have fixed up the errors.
btrfs scrub now reports no errors.
How do I reset these counters to zero?

BTRFS info (device sdc2): bdev /dev/sdc2 errs: wr 0, rd 35, flush 0,
corrupt 1, gen 0
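
For reference, these are the per-device counters that `btrfs device stats` prints, and its `-z` (`--reset`) option prints them and then resets them to zero. Below is a minimal Python sketch, assuming the log line format shown above, for pulling the counters out of such a line to see which ones are still non-zero. The function name `parse_btrfs_errs` is hypothetical, not part of any btrfs tool.

```python
import re


def parse_btrfs_errs(line):
    """Return the counters from a 'bdev ... errs:' kernel log line as a dict."""
    # Keep only the text after 'errs:'; counters look like 'rd 35', comma separated.
    _, _, tail = line.partition("errs:")
    return {name: int(value) for name, value in re.findall(r"(\w+)\s+(\d+)", tail)}


# The wrapped log line from the message above, joined back into one string.
line = ("BTRFS info (device sdc2): bdev /dev/sdc2 errs: wr 0, rd 35, flush 0, "
        "corrupt 1, gen 0")
counters = parse_btrfs_errs(line)
nonzero = {name: count for name, count in counters.items() if count}
print(nonzero)  # prints {'rd': 35, 'corrupt': 1}
```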


Kind Regards


James


Re: btrfs balance problems

2018-01-06 Thread James Courtier-Dutton
On 28 December 2017 at 00:39, Duncan <1i5t5.dun...@cox.net> wrote:
>
> AFAIK, ionice only works for some IO schedulers, not all.  It does work
> with the default CFQ scheduler, but I don't /believe/ it works with
> deadline, certainly not with noop, and I'd /guess/ it doesn't work with
> block-multiqueue (and thus not with bfq or kyber) at all, tho it's
> possible it does in the latest kernels, since multi-queue is targeted to
> eventually replace, at least as default, the older single-queue options.
>
> So which scheduler are you using and are you on multi-queue or not?
>

Thank you. The install had defaulted to deadline.
I have now switched it to CFQ, and the system is much more
responsive/interactive now during a btrfs balance.

I will test it when I next get a chance, to see if that has helped me.
After reading about it:
deadline: more likely to complete long sequential reads/writes and
not switch tasks. Thus reducing the amount of seeking but impacting
concurrent tasks.
cfq: more likely to break up long sequential reads/writes to permit
other tasks to do some work. Thus increasing the amount of seeking but
helping concurrent tasks.

This would explain why "cfq" is best for me.
I have not yet looked at "multi-queue".
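
To check which scheduler a disk is actually using, the kernel exposes the list in `/sys/block/<device>/queue/scheduler`, with the active one in square brackets. A minimal Python sketch of reading it follows; the function names and the device name `sda` are only examples, and the sample string is an assumption about what a single-queue kernel might show.

```python
def active_scheduler(text):
    """Return the bracketed (active) scheduler name, or None if none is marked."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_scheduler(device):
    """Read the active scheduler for a block device such as 'sda' from sysfs."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return active_scheduler(f.read())


# Parsing a sample of the sysfs file contents (assumed example, not real output).
print(active_scheduler("noop deadline [cfq]"))  # prints cfq
```

On a real system, `read_scheduler("sda")` would return the same kind of name for that disk.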


Re: btrfs balance problems

2017-12-27 Thread James Courtier-Dutton
Hi,

Thank you for your suggestion.
It does not help at all.
btrfs balance's behaviour seems to be unchanged by ionice.
It still takes 100% while working and starves all other processes of
disk access.

How can I get btrfs balance to work in the background, without adversely
affecting other applications?

>
> On 23 December 2017 at 11:56, Alberto Bursi <alberto.bu...@outlook.it> wrote:
>>
>>
>> On 12/23/2017 12:19 PM, James Courtier-Dutton wrote:
>>> Hi,
>>>
>>> During a btrfs balance, the process hogs all CPU.
>>> Or, to be exact, any other program that wishes to use the SSD during a
>>> btrfs balance is blocked for long periods. Long periods being more
>>> than 5 seconds.
>>> Is there any way to multiplex SSD access while btrfs balance is
>>> operating, so that other applications can still access the SSD with
>>> relatively low latency?
>>>
>>> My guess is that btrfs is doing a transaction with a large number of
>>> SSD blocks at a time, and thus blocking other applications.
>>>
>>> This makes for atrocious user interactivity, as well as applications
>>> failing because they cannot access the disk with relatively low
>>> latency.
>>> For example, this is causing a High Definition network CCTV
>>> application to fail.
>>>
>>> What I would really like is some way to limit the SSD bandwidth
>>> available to applications.
>>> For example the CCTV app always gets the bandwidth it needs, and all
>>> other applications can still access the SSD, but are rate limited.
>>> This would fix my particular problem.
>>> We have rate limiting for network applications, why not disk access also?
>>>
>>> Kind Regards
>>>
>>> James
>>>
>>
>> For most I/O-intensive programs on Linux you can use the "ionice" tool to
>> change the disk access priority of a process. [1]
>> This allows me to run I/O intensive background scripts in servers
>> without the users noticing slowdowns or lagging, of course this means
>> the process doing heavy I/O will run more slowly or get outright paused
>> if higher-priority processes need a lot of access to the disk.
>>
>> It works on btrfs balance too, see (commandline example) [2].
>>
>> If you don't start the process with ionice as in [2], you can always
>> change the priority later if you get the process ID. I use iotop
>> [3], which also supports commandline arguments to integrate its output
>> in scripts.
>>
>> For btrfs scrub it seems to be possible to specify the ionice options
>> directly, while btrfs balance does not seem to have them (would be nice
>> to add them imho). [4]
>>
>> For the sake of completeness, there is also "nice" tool for CPU usage
>> priority (also used in my scripts on servers to keep the scripts from
>> hogging the CPU for what is just a background process, and seen in [2]
>> commandline too). [5]
>>
>> 1. http://man7.org/linux/man-pages/man1/ionice.1.html
>> 2.
>> https://unix.stackexchange.com/questions/390480/nice-and-ionice-which-one-should-come-first
>> 3. http://man7.org/linux/man-pages/man8/iotop.8.html
>> 4. https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-scrub
>> 5. http://man7.org/linux/man-pages/man1/nice.1.html
>>
>> -Alberto


btrfs balance problems

2017-12-23 Thread James Courtier-Dutton
Hi,

During a btrfs balance, the process hogs all CPU.
Or, to be exact, any other program that wishes to use the SSD during a
btrfs balance is blocked for long periods. Long periods being more
than 5 seconds.
Is there any way to multiplex SSD access while btrfs balance is
operating, so that other applications can still access the SSD with
relatively low latency?

My guess is that btrfs is doing a transaction with a large number of
SSD blocks at a time, and thus blocking other applications.

This makes for atrocious user interactivity, as well as applications
failing because they cannot access the disk with relatively low
latency.
For example, this is causing a High Definition network CCTV
application to fail.

What I would really like is some way to limit the SSD bandwidth
available to applications.
For example the CCTV app always gets the bandwidth it needs, and all
other applications can still access the SSD, but are rate limited.
This would fix my particular problem.
We have rate limiting for network applications, why not disk access also?
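
The kind of rate limiting asked for here is often pictured as a token bucket: each application gets a byte budget that refills at a fixed rate, and a request is only allowed when enough budget is available. The Python sketch below is an illustration of that idea only, not an existing btrfs or kernel interface; the class name `TokenBucket` and the numbers are hypothetical.

```python
class TokenBucket:
    """Allow at most rate_bytes_per_s on average, with bursts up to burst_bytes."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # time of the previous request, in seconds

    def allow(self, nbytes, now):
        """Return True if a request of nbytes may proceed at time 'now'."""
        # Refill for the elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# A rate-limited application: 100 bytes/s on average, bursts of up to 200 bytes.
bucket = TokenBucket(rate_bytes_per_s=100, burst_bytes=200)
first = bucket.allow(150, now=0.0)   # True: fits in the initial burst
second = bucket.allow(100, now=0.0)  # False: only 50 bytes of budget remain
third = bucket.allow(100, now=1.0)   # True: one second refilled 100 bytes
```

An application that must always get its bandwidth, such as the CCTV recorder, would simply not be put behind a bucket.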

Kind Regards

James


Re: [PATCH] Btrfs: do not mount when we have a sectorsize larger than PAGE_SIZE

2012-04-04 Thread James Courtier-Dutton
On 3 April 2012 02:15, Liu Bo <liubo2...@cn.fujitsu.com> wrote:
> On 04/02/2012 08:17 PM, David Sterba wrote:
>> On Mon, Apr 02, 2012 at 07:28:18PM +0800, Liu Bo wrote:
>>> --- a/fs/btrfs/disk-io.c
>>> +++ b/fs/btrfs/disk-io.c
>>> @@ -2104,6 +2104,14 @@ int open_ctree(struct super_block *sb,
>>>               err = -EINVAL;
>>>               goto fail_alloc;
>>>       }
>>> +    if (btrfs_super_sectorsize(disk_super) > PAGE_CACHE_SIZE) {
>>> +            printk(KERN_ERR "BTRFS: couldn't mount because sectorsize(%d)"
>>> +                    " was larger than PAGE_SIZE(%lu)\n",
>>
>> %llu
>>
>
> err, thanks for catching it.
>
>>> +                   btrfs_super_sectorsize(disk_super),
>>> +                   (unsigned long long)PAGE_CACHE_SIZE);
>>> +            err = -EINVAL;
>>> +            goto fail_alloc;
>>> +    }
>>>
>>>       features = btrfs_super_incompat_flags(disk_super);
>>>       features |= BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF;
>>
>> We have the opposite check a few lines below
>>
>>         if (sectorsize < PAGE_SIZE) {
>>                 printk(KERN_WARNING "btrfs: Incompatible sector size "
>>                        "found on %s\n", sb->s_id);
>>                 goto fail_sb_buffer;
>>         }
>>
>> so sectorsize must be equal to PAGE_SIZE always and one check can catch
>> both cases.
>>
>
> But this check is _useless_ when we have a sectorsize which is larger than
> PAGE_SIZE; we're not ready for that, either.
>
> We already have one check, so I'll modify this instead. :)

Minor observation.
One is PAGE_SIZE and one is PAGE_CACHE_SIZE.
Might those be different?
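
In kernels of that period, include/linux/pagemap.h defined PAGE_CACHE_SIZE from PAGE_CACHE_SHIFT, which was itself defined as PAGE_SHIFT, so the two had the same value. The point made in the quoted discussion, that a "larger than" check plus the existing "smaller than" check together amount to requiring equality, can be sketched as follows. This is an illustration in Python, not the kernel code; `sectorsize_ok` is a hypothetical name, and the page size is read from the running system with `os.sysconf`.

```python
import os


def sectorsize_ok(sectorsize, page_size):
    """Combine the two rejections from the thread into one result."""
    too_large = sectorsize > page_size   # the check added by the patch
    too_small = sectorsize < page_size   # the check that already existed
    # Rejecting both cases is the same as requiring sectorsize == page_size.
    return not (too_large or too_small)


page_size = os.sysconf("SC_PAGE_SIZE")
results = {s: sectorsize_ok(s, page_size)
           for s in (page_size // 2, page_size, page_size * 2)}
print(results)
```

Only the entry for the page size itself comes out as True, which is why a single check of the form sectorsize != PAGE_SIZE would cover both cases.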


btrfs and backups

2012-03-26 Thread James Courtier-Dutton
Hi,

I have a local btrfs file system with various sub-volumes that have
had snapshots done on them.

Is there some tool like rsync that I could use to copy all the data and
snapshots to a backup system, while still only using the same amount of
space as the source filesystem?
One problem I see is getting a consistent and steady state during the rsync.
I was thinking that I might be able to do this with LVM snapshots, but
that would require something along these lines:
1) pause the btrfs filesystem into a consistent state that can be
mounted cleanly
2) Do LVM snapshot on it.
3) un-pause btrfs filesystem.

I can then do a block level backup of the LVM snapshot and it should
be mountable on the backup server.
So, the snapshot is not a snapshot of the current filesystem, it is a
snapshot of all the snapshots and all the sub-volumes at a particular
time, that is in a stable state to be backed up.

I don't know whether step 1 is supported.
I suppose I am hoping for 1,2,3 to already be supported by some
special btrfs command.


Any ideas?

Kind Regards

James