Re: btrfs balance problems

2018-01-06 Thread James Courtier-Dutton
On 28 December 2017 at 00:39, Duncan <1i5t5.dun...@cox.net> wrote:
>
> AFAIK, ionice only works for some IO schedulers, not all.  It does work
> with the default CFQ scheduler, but I don't /believe/ it works with
> deadline, certainly not with noop, and I'd /guess/ it doesn't work with
> block-multiqueue (and thus not with bfq or kyber) at all, tho it's
> possible it does in the latest kernels, since multi-queue is targeted to
> eventually replace, at least as default, the older single-queue options.
>
> So which scheduler are you using and are you on multi-queue or not?
>

Thank you. The install had defaulted to deadline.
I have now switched it to CFQ, and the system is much more
responsive/interactive during a btrfs balance.

I will test it more thoroughly when I next get a chance, to confirm that
this is what helped. From what I have read about the two schedulers:
deadline: more likely to let long sequential reads/writes run to completion
without switching tasks, which reduces the amount of seeking but hurts
concurrent tasks.
cfq: more likely to break up long sequential reads/writes to let other
tasks do some work, which increases the amount of seeking but helps
concurrent tasks.

This would explain why "cfq" is best for me.
I have not yet looked at "multi-queue".
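
For reference, the active scheduler can be checked and switched at runtime
via sysfs (sda below is only a placeholder for the real device, and the
change lasts until reboot):

$ cat /sys/block/sda/queue/scheduler          # active scheduler shown in [brackets]
$ echo cfq | sudo tee /sys/block/sda/queue/scheduler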


Re: btrfs balance problems

2017-12-29 Thread Hans van Kranenburg
On 12/28/2017 12:15 PM, Nikolay Borisov wrote:
> 
> On 23.12.2017 13:19, James Courtier-Dutton wrote:
>>
>> During a btrfs balance, the process hogs all CPU.
>> Or, to be exact, any other program that wishes to use the SSD during a
>> btrfs balance is blocked for long periods, meaning more than 5 seconds.
>> Is there any way to multiplex SSD access while btrfs balance is
>> operating, so that other applications can still access the SSD with
>> relatively low latency?
>>
>> My guess is that btrfs is doing a transaction with a large number of
>> SSD blocks at a time, and thus blocking other applications.
>>
>> This makes for atrocious user interactivity, and applications fail
>> because they cannot access the disk with relatively low latency.
>> For example, this is causing a High Definition network CCTV
>> application to fail.
>>
>> What I would really like is some way to limit SSD bandwidth per
>> application.
>> For example, the CCTV app always gets the bandwidth it needs, while all
>> other applications can still access the SSD but are rate limited.
>> This would fix my particular problem.
>> We have rate limiting for network applications; why not for disk access
>> as well?
> 
> So how are you running btrfs balance?

Or, to again take one step further back...

*Why* are you running btrfs balance at all?

:)

> Are you using any filters
> whatsoever? The documentation
> [https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-balance] has the
> following warning:
> 
> Warning: running balance without filters will take a lot of time as it
> basically rewrites the entire filesystem and needs to update all block
> pointers.

-- 
Hans van Kranenburg


Re: btrfs balance problems

2017-12-29 Thread Kai Krakow
On Thu, 28 Dec 2017 00:39:37 +, Duncan wrote:

>> How can I get btrfs balance to work in the background, without adversely
>> affecting other applications?
> 
> I'd actually suggest a different strategy.
> 
> What I did here way back when I was still on reiserfs on spinning rust,
> where it made more difference than on ssd, but I kept the settings when
> I switched to ssd and btrfs, and at least some others have mentioned
> that similar settings helped them on btrfs as well, is...
> 
> Problem: The kernel virtual-memory subsystem's writeback cache was
> originally configured for systems with well under a Gigabyte of RAM, and
> the defaults no longer work so well on multi-GiB-RAM systems,
> particularly above 8 GiB RAM, because they are based on a percentage of
> available RAM, and will typically let several GiB of dirty writeback
> cache accumulate before kicking off any attempt to actually write it to
> storage.  On spinning rust, when writeback /does/ finally kickoff, this
> can result in hogging the IO for well over half a minute at a time,
> where 30 seconds also happens to be the default "flush it anyway" time.
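
A rough sketch of the kind of writeback tuning described above (the byte
values are only illustrative and should be sized for your own storage):

$ sudo sysctl -w vm.dirty_background_bytes=67108864   # start background writeback at 64 MiB dirty
$ sudo sysctl -w vm.dirty_bytes=268435456             # throttle writers once 256 MiB is dirty

Setting the *_bytes variants overrides the percentage-based *_ratio
defaults.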

This is somewhat like the bufferbloat discussion in networking... Big
buffers increase latency. And there is more than one type of buffer involved.

In addition to what Duncan wrote (the first type of buffer), the kernel
recently gained a new option to fight this "buffer bloat": writeback
throttling. It may help to enable that option.
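
If your kernel is new enough and built with CONFIG_BLK_WBT, the target
latency it aims for can be read and set per device (sda is a placeholder;
the unit is microseconds):

$ cat /sys/block/sda/queue/wbt_lat_usec
$ echo 75000 | sudo tee /sys/block/sda/queue/wbt_lat_usec   # target roughly 75 ms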

The second type of buffer is the io queue.

So, you may also want to lower the io queue depth (nr_requests) of your 
devices. I think it defaults to 128 while most consumer drives only have 
a queue depth of 31 or 32 commands. Thus, reducing nr_requests for some 
of your devices may help you achieve better latency (but reduces 
throughput).

Especially if working with io schedulers that do not implement io 
priorities, you could simply lower nr_requests to around or below the 
native command queue depth of your devices. The device itself can handle 
it better in that case, especially on spinning rust, as the firmware 
knows when to pull certain selected commands from the queue during a 
rotation of the media. The kernel knows nothing about rotational
positions; it can only use the queue to prioritize and reorder requests,
and cannot take advantage of head position the way the firmware can.

See

$ grep ^ /sys/block/*/queue/nr_requests
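
and, to lower it for a single device (32 is only an example, roughly
matching a typical NCQ depth; sda is a placeholder):

$ echo 32 | sudo tee /sys/block/sda/queue/nr_requests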


You may instead get better results by increasing nr_requests, but at the
cost of also having to adjust the write buffer sizes, because with large
nr_requests you don't want writes to block so early, at least not when you
need good latency. This probably works best with schedulers that care
about latency, like deadline or kyber.

For testing, keep in mind that all of these settings interact with each
other. So change one at a time, run your tests, then change another and
see how it relates to the first change, even if the first change made your
experience worse.

Another tip that hasn't been mentioned yet: put different access classes
onto different devices. That is, if you have a directory structure that is
mostly written to, put it on its own physical device, with separate tuning
and an appropriate filesystem (log-structured and CoW filesystems are good
at streaming writes). Put read-mostly workloads on their own device and
filesystem too, and realtime workloads on their own device and filesystem.
This gives you a much better chance of success.


-- 
Regards,
Kai

Replies to list-only preferred.



Re: btrfs balance problems

2017-12-28 Thread Nikolay Borisov


On 23.12.2017 13:19, James Courtier-Dutton wrote:
> Hi,
> 
> During a btrfs balance, the process hogs all CPU.
> Or, to be exact, any other program that wishes to use the SSD during a
> btrfs balance is blocked for long periods, meaning more than 5 seconds.
> Is there any way to multiplex SSD access while btrfs balance is
> operating, so that other applications can still access the SSD with
> relatively low latency?
> 
> My guess is that btrfs is doing a transaction with a large number of
> SSD blocks at a time, and thus blocking other applications.
> 
> This makes for atrocious user interactivity, and applications fail
> because they cannot access the disk with relatively low latency.
> For example, this is causing a High Definition network CCTV
> application to fail.
> 
> What I would really like is some way to limit SSD bandwidth per
> application.
> For example, the CCTV app always gets the bandwidth it needs, while all
> other applications can still access the SSD but are rate limited.
> This would fix my particular problem.
> We have rate limiting for network applications; why not for disk access
> as well?

So how are you running btrfs balance? Are you using any filters
whatsoever? The documentation
[https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-balance] has the
following warning:

Warning: running balance without filters will take a lot of time as it
basically rewrites the entire filesystem and needs to update all block
pointers.
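
For comparison, a filtered balance that only rewrites mostly-empty chunks
would look something like this (the 50% usage cutoff and the mount point
are only examples):

$ sudo btrfs balance start -dusage=50 -musage=50 /mnt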


> 
> Kind Regards
> 
> James


Re: btrfs balance problems

2017-12-27 Thread Duncan
James Courtier-Dutton posted on Wed, 27 Dec 2017 21:39:30 + as
excerpted:

> Thank you for your suggestion.

Please put your reply in standard list quote/reply-in-context order.  It 
makes further replies, /in/ /context/, far easier.  I've moved the rest 
of your reply to do that, but I shouldn't have to...

>> On 23 December 2017 at 11:56, Alberto Bursi 
>> wrote:
>>>
>>> On 12/23/2017 12:19 PM, James Courtier-Dutton wrote:

>>>> During a btrfs balance, the process hogs all CPU.
>>>> Or, to be exact, any other program that wishes to use the SSD during
>>>> a btrfs balance is blocked for long periods, meaning more than 5
>>>> seconds.

Blocking disk access isn't hogging the CPU, it's hogging the disk IO.

Tho FWIW we don't have many complaints about btrfs hogging /ssd/ 
access[1], tho we do have some complaining about problems on legacy 
spinning-rust.

>>>> Is there any way to multiplex SSD access while btrfs balance is
>>>> operating, so that other applications can still access the SSD with
>>>> relatively low latency?

>>>> My guess is that btrfs is doing a transaction with a large number of
>>>> SSD blocks at a time, and thus blocking other applications.

>>>> This makes for atrocious user interactivity, and applications fail
>>>> because they cannot access the disk with relatively low latency.
>>>> For example, this is causing a High Definition network CCTV
>>>> application to fail.

That sort of low-latency requirement is outside my own use-case, but I do
have some suggestions...

>>>> What I would really like is some way to limit SSD bandwidth per
>>>> application.
>>>> For example, the CCTV app always gets the bandwidth it needs, while
>>>> all other applications can still access the SSD but are rate limited.
>>>> This would fix my particular problem.
>>>> We have rate limiting for network applications; why not for disk
>>>> access as well?

>>> On most I/O intensive programs in Linux you can use "ionice" tool to
>>> change the disk access priority of a process. [1]

AFAIK, ionice only works for some IO schedulers, not all.  It does work 
with the default CFQ scheduler, but I don't /believe/ it works with 
deadline, certainly not with noop, and I'd /guess/ it doesn't work with 
block-multiqueue (and thus not with bfq or kyber) at all, tho it's 
possible it does in the latest kernels, since multi-queue is targeted to 
eventually replace, at least as default, the older single-queue options.

So which scheduler are you using and are you on multi-queue or not?

Meanwhile, where ionice /does/ work, using normal nice 19 should place
the process in low-priority batch mode, which should automatically lower
its effective IO priority as well.  That's what I normally
use for such things here, on gentoo, where I schedule my package builds 
at nice 19, tho I also do the actual builds on tmpfs, so they don't 
actually touch anything but memory for the build itself, only fetching 
the sources, storing the built binpkg, and installing it to the main 
system.
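
Where it applies, the effective IO class of an already running process can
be checked with ionice itself (the pidof pattern is just an example):

$ ionice -p $(pidof btrfs)          # prints e.g. "best-effort: prio 4"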

>>> This allows me to run I/O intensive background scripts in servers
>>> without the users noticing slowdowns or lagging, of course this means
>>> the process doing heavy I/O will run more slowly or get outright
>>> paused if higher-priority processes need a lot of access to the disk.
>>>
>>> It works on btrfs balance too, see (commandline example) [2].

There's a problem with that example.  See below.

>>> If you don't start the process with ionice as in [2], you can always
>>> change the priority later if you get the process ID. I use
>>> iotop [3], which also supports commandline arguments to integrate its
>>> output in scripts.
>>>
>>> For btrfs scrub it seems to be possible to specify the ionice options
>>> directly, while btrfs balance does not seem to have them (would be
>>> nice to add them imho). [4]
>>>
>>> For the sake of completeness, there is also "nice" tool for CPU usage
>>> priority (also used in my scripts on servers to keep the scripts from
>>> hogging the CPU for what is just a background process, and seen in [2]
>>> commandline too). [5]
>>>
>>> 1. http://man7.org/linux/man-pages/man1/ionice.1.html
>>> 2. https://unix.stackexchange.com/questions/390480/nice-and-ionice-which-one-should-come-first
>>> 3. http://man7.org/linux/man-pages/man8/iotop.8.html
>>> 4. https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-scrub
>>> 5. http://man7.org/linux/man-pages/man1/nice.1.html

> It does not help at all.
> btrfs balance's behaviour seems to be unchanged by ionice.
> It still takes 100% while working and starves all other processes of
> disk access.

100% CPU, or 100% IO?  How are you measuring?  If with iotop, 99% of the
time waiting on IO isn't bad for an IO-bound process, and doesn't by itself
mean nothing else can get IO in (tho 99% for that CCTV process /could/ be
a problem, if it's normally much lower and is only at 99% because btrfs is
taking what it needs).
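
FWIW, a quick way to see which processes are actually doing IO (as opposed
to burning CPU) is something like:

$ sudo iotop -o -b -n 3     # -o: only tasks doing IO, -b: batch output, -n 3: three samples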

100% of 

Re: btrfs balance problems

2017-12-27 Thread James Courtier-Dutton
Hi,

Thank you for your suggestion.
It does not help at all.
btrfs balance's behaviour seems to be unchanged by ionice.
It still takes 100% while working and starves all other processes of
disk access.

How can I get btrfs balance to work in the background, without adversely
affecting other applications?

>
> On 23 December 2017 at 11:56, Alberto Bursi  wrote:
>>
>>
>> On 12/23/2017 12:19 PM, James Courtier-Dutton wrote:
>>> Hi,
>>>
>>> During a btrfs balance, the process hogs all CPU.
>>> Or, to be exact, any other program that wishes to use the SSD during a
>>> btrfs balance is blocked for long periods, meaning more than 5 seconds.
>>> Is there any way to multiplex SSD access while btrfs balance is
>>> operating, so that other applications can still access the SSD with
>>> relatively low latency?
>>>
>>> My guess is that btrfs is doing a transaction with a large number of
>>> SSD blocks at a time, and thus blocking other applications.
>>>
>>> This makes for atrocious user interactivity, and applications fail
>>> because they cannot access the disk with relatively low latency.
>>> For example, this is causing a High Definition network CCTV
>>> application to fail.
>>>
>>> What I would really like is some way to limit SSD bandwidth per
>>> application.
>>> For example, the CCTV app always gets the bandwidth it needs, while all
>>> other applications can still access the SSD but are rate limited.
>>> This would fix my particular problem.
>>> We have rate limiting for network applications; why not for disk access
>>> as well?
>>>
>>> Kind Regards
>>>
>>> James
>>>
>>
>> On most I/O intensive programs in Linux you can use "ionice" tool to
>> change the disk access priority of a process. [1]
>> This allows me to run I/O intensive background scripts in servers
>> without the users noticing slowdowns or lagging, of course this means
>> the process doing heavy I/O will run more slowly or get outright paused
>> if higher-priority processes need a lot of access to the disk.
>>
>> It works on btrfs balance too, see (commandline example) [2].
>>
>> If you don't start the process with ionice as in [2], you can always
>> change the priority later if you get the process ID. I use iotop
>> [3], which also supports commandline arguments to integrate its output
>> in scripts.
>>
>> For btrfs scrub it seems to be possible to specify the ionice options
>> directly, while btrfs balance does not seem to have them (would be nice
>> to add them imho). [4]
>>
>> For the sake of completeness, there is also "nice" tool for CPU usage
>> priority (also used in my scripts on servers to keep the scripts from
>> hogging the CPU for what is just a background process, and seen in [2]
>> commandline too). [5]
>>
>> 1. http://man7.org/linux/man-pages/man1/ionice.1.html
>> 2. https://unix.stackexchange.com/questions/390480/nice-and-ionice-which-one-should-come-first
>> 3. http://man7.org/linux/man-pages/man8/iotop.8.html
>> 4. https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-scrub
>> 5. http://man7.org/linux/man-pages/man1/nice.1.html
>>
>> -Alberto


Re: btrfs balance problems

2017-12-23 Thread Alberto Bursi


On 12/23/2017 12:19 PM, James Courtier-Dutton wrote:
> Hi,
>
> During a btrfs balance, the process hogs all CPU.
> Or, to be exact, any other program that wishes to use the SSD during a
> btrfs balance is blocked for long periods, meaning more than 5 seconds.
> Is there any way to multiplex SSD access while btrfs balance is
> operating, so that other applications can still access the SSD with
> relatively low latency?
>
> My guess is that btrfs is doing a transaction with a large number of
> SSD blocks at a time, and thus blocking other applications.
>
> This makes for atrocious user interactivity, and applications fail
> because they cannot access the disk with relatively low latency.
> For example, this is causing a High Definition network CCTV
> application to fail.
>
> What I would really like is some way to limit SSD bandwidth per
> application.
> For example, the CCTV app always gets the bandwidth it needs, while all
> other applications can still access the SSD but are rate limited.
> This would fix my particular problem.
> We have rate limiting for network applications; why not for disk access
> as well?
>
> Kind Regards
>
> James
>

For most I/O-intensive programs on Linux you can use the "ionice" tool to
change the disk access priority of a process. [1]
This allows me to run I/O-intensive background scripts on servers
without the users noticing slowdowns or lag; of course, this means
the process doing heavy I/O will run more slowly or get outright paused
if higher-priority processes need a lot of access to the disk.

It works on btrfs balance too, see (commandline example) [2].

If you don't start the process with ionice as in [2], you can always 
change the priority later if you get the process ID. I use iotop
[3], which also supports commandline arguments to integrate its output 
in scripts.

For btrfs scrub it seems to be possible to specify the ionice options 
directly, while btrfs balance does not seem to have them (would be nice 
to add them imho). [4]

For the sake of completeness, there is also the "nice" tool for CPU usage
priority (also used in my scripts on servers to keep the scripts from 
hogging the CPU for what is just a background process, and seen in [2] 
commandline too). [5]
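
Along the lines of the linked example [2], a combined invocation might look
something like this (the usage filter and mount point are only placeholders):

$ sudo nice -n 19 ionice -c 3 btrfs balance start -dusage=50 /mnt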

1. http://man7.org/linux/man-pages/man1/ionice.1.html
2. https://unix.stackexchange.com/questions/390480/nice-and-ionice-which-one-should-come-first
3. http://man7.org/linux/man-pages/man8/iotop.8.html
4. https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-scrub
5. http://man7.org/linux/man-pages/man1/nice.1.html

-Alberto

btrfs balance problems

2017-12-23 Thread James Courtier-Dutton
Hi,

During a btrfs balance, the process hogs all CPU.
Or, to be exact, any other program that wishes to use the SSD during a
btrfs balance is blocked for long periods, meaning more than 5 seconds.
Is there any way to multiplex SSD access while btrfs balance is
operating, so that other applications can still access the SSD with
relatively low latency?

My guess is that btrfs is doing a transaction with a large number of
SSD blocks at a time, and thus blocking other applications.

This makes for atrocious user interactivity, and applications fail
because they cannot access the disk with relatively low latency.
For example, this is causing a High Definition network CCTV
application to fail.

What I would really like is some way to limit SSD bandwidth per
application.
For example, the CCTV app always gets the bandwidth it needs, while all
other applications can still access the SSD but are rate limited.
This would fix my particular problem.
We have rate limiting for network applications; why not for disk access
as well?

Kind Regards

James