Re: btrfs-cleaner / snapshot performance analysis

2018-02-12 Thread Ellis H. Wilson III
On 02/11/2018 01:03 PM, Hans van Kranenburg wrote: 3. I need to look at the code to understand the interplay between qgroups, snapshots, and foreground I/O performance as there isn't existing architecture documentation to point me to that covers this Well, the excellent write-up of Qu this

Re: btrfs-cleaner / snapshot performance analysis

2018-02-12 Thread Ellis H. Wilson III
On 02/11/2018 01:24 PM, Hans van Kranenburg wrote: Why not just use `btrfs fi du ` now and then and update your administration with the results? .. Instead of putting the burden of keeping track of all administration during every tiny change all day long? I will look into that if using

Re: btrfs-cleaner / snapshot performance analysis

2018-02-12 Thread Ellis H. Wilson III
On 02/12/2018 12:09 PM, Hans van Kranenburg wrote: You are in the To: of it: https://www.spinics.net/lists/linux-btrfs/msg74737.html Apparently MS365 decided my disabling of junk/clutter filter rules some year+ ago wasn't wise and re-enabled it. I wondered why I wasn't seeing my own

Re: btrfs-cleaner / snapshot performance analysis

2018-02-12 Thread Ellis H. Wilson III
On 02/12/2018 11:02 AM, Austin S. Hemmelgarn wrote: I will look into that if using built-in group capacity functionality proves to be truly untenable.  Thanks! As a general rule, unless you really need to actively prevent a subvolume from exceeding its quota, this will generally be more

Re: btrfs-cleaner / snapshot performance analysis

2018-02-11 Thread Ellis H. Wilson III
Thanks Tomasz, Comments in-line: On 02/10/2018 05:05 PM, Tomasz Pala wrote: You won't have anything close to "accurate" in btrfs - quotas don't include space wasted by fragmentation, which happens to allocate from tens to thousands times (sic!) more space than the files itself. Not in some

Re: btrfs-cleaner / snapshot performance analysis

2018-02-11 Thread Ellis H. Wilson III
Thanks Hans. Sorry for the top-post, but I'm boiling things down here so I don't have a clear line-item to respond to. The take-aways I see here to my original queries are: 1. Nobody has done a thorough analysis of the impact of snapshot manipulation WITHOUT qgroups enabled on foreground

Status of FST and mount times

2018-02-14 Thread Ellis H. Wilson III
Hi again -- back with a few more questions: Frame-of-reference here: RAID0. Around 70TB raw capacity. No compression. No quotas enabled. Many (potentially tens to hundreds) of subvolumes, each with tens of snapshots. No control over size or number of files, but directory tree (entries

Re: Status of FST and mount times

2018-02-14 Thread Ellis H. Wilson III
On 02/14/2018 12:08 PM, Nikolay Borisov wrote: V1 for large filesystems is just awful. Facebook have been experiencing the pain hence they implemented v2. You can view the spacecache tree as the complement version of the extent tree. v1 cache is implemented as a hidden inode and even though

Re: btrfs-cleaner / snapshot performance analysis

2018-02-10 Thread Ellis H. Wilson III
Thank you very much for your response Hans. Comments in-line, but I did want to handle one miscommunication straight-away: I'm a huge fan of BTRFS. If I came off like I was complaining, my sincere apologies. To be completely transparent we are using BTRFS in a very large project at my

Re: Status of FST and mount times

2018-02-15 Thread Ellis H. Wilson III
On 02/15/2018 01:14 AM, Chris Murphy wrote: On Wed, Feb 14, 2018 at 9:00 AM, Ellis H. Wilson III <ell...@panasas.com> wrote: Frame-of-reference here: RAID0. Around 70TB raw capacity. No compression. No quotas enabled. Many (potentially tens to hundreds) of subvolumes, each wit

Re: Status of FST and mount times

2018-02-15 Thread Ellis H. Wilson III
On 02/15/2018 11:51 AM, Austin S. Hemmelgarn wrote: There are scaling performance issues with directory listings on BTRFS for directories with more than a few thousand files, but they're not well documented (most people don't hit them because most applications are designed around the

Metadata / Data on Heterogeneous Media

2018-02-15 Thread Ellis H. Wilson III
In discussing the performance of various metadata operations over the past few days I've had this idea in the back of my head, and wanted to see if anybody had already thought about it before (likely, I would guess). It appears based on this page:

Re: Status of FST and mount times

2018-02-15 Thread Ellis H. Wilson III
On 02/14/2018 06:24 PM, Duncan wrote: Frame-of-reference here: RAID0. Around 70TB raw capacity. No compression. No quotas enabled. Many (potentially tens to hundreds) of subvolumes, each with tens of snapshots. No control over size or number of files, but directory tree (entries per dir and

Re: Status of FST and mount times

2018-02-15 Thread Ellis H. Wilson III
On 02/15/2018 06:12 AM, Hans van Kranenburg wrote: On 02/15/2018 02:42 AM, Qu Wenruo wrote: Just as said by Nikolay, the biggest problem of slow mount is the size of extent tree (and HDD seek time) The easiest way to get a basic idea of how large your extent tree is using debug tree: #

Re: Status of FST and mount times

2018-02-21 Thread Ellis H. Wilson III
On 02/20/2018 08:49 PM, Qu Wenruo wrote: On 2018-02-16 22:12, Ellis H. Wilson III wrote: $ sudo btrfs-debug-tree -t chunk /dev/sdb | grep CHUNK_ITEM | wc -l 3454 Increasing node size may reduce extent tree size. Although at most reduce one level AFAIK. But considering that the higher

Re: Status of FST and mount times

2018-02-20 Thread Ellis H. Wilson III
On 02/16/2018 07:59 PM, Qu Wenruo wrote: On 2018-02-16 22:12, Ellis H. Wilson III wrote: $ sudo btrfs-debug-tree -t chunk /dev/sdb | grep CHUNK_ITEM | wc -l 3454 OK, this explains everything. There are too many chunks. This means at mount you need to search for block group item 3454 times

Re: [RFC PATCH] btrfs: Speedup btrfs_read_block_groups()

2018-02-22 Thread Ellis H. Wilson III
tree search, instead of searching with some uncertain value and do forward search. In some case, like next BLOCK_GROUP_ITEM is in the next leaf of current path, we could save such unnecessary tree block read. Cc: Ellis H. Wilson III <ell...@panasas.com> Hi Ellis, Would you plea

Re: Status of FST and mount times

2018-02-16 Thread Ellis H. Wilson III
On 02/15/2018 08:55 PM, Qu Wenruo wrote: On 2018-02-16 00:30, Ellis H. Wilson III wrote: Very helpful information.  Thank you Qu and Hans! I have about 1.7TB of newly rsync'd homedir data on a single enterprise 7200rpm HDD and the following output for btrfs-debug: extent tree key

Re: Status of FST and mount times

2018-02-16 Thread Ellis H. Wilson III
On 02/16/2018 09:20 AM, Hans van Kranenburg wrote: Well, imagine you have a big tree (an actual real life tree outside) and you need to pick things (e.g. apples) which are hanging everywhere. So, what you need to do is climb the tree, climb on a branch all the way to the end where the first

Re: Status of FST and mount times

2018-02-16 Thread Ellis H. Wilson III
On 02/16/2018 09:42 AM, Ellis H. Wilson III wrote: On 02/16/2018 09:20 AM, Hans van Kranenburg wrote: Well, imagine you have a big tree (an actual real life tree outside) and you need to pick things (e.g. apples) which are hanging everywhere. So, what you need to do is climb the tree, climb

Re: Metadata / Data on Heterogeneous Media

2018-02-15 Thread Ellis H. Wilson III
On 02/15/2018 02:11 PM, Hugo Mills wrote: On Thu, Feb 15, 2018 at 12:15:49PM -0500, Ellis H. Wilson III wrote: In discussing the performance of various metadata operations over the past few days I've had this idea in the back of my head, and wanted to see if anybody had already thought about

Re: Metadata / Data on Heterogeneous Media

2018-02-15 Thread Ellis H. Wilson III
On 02/15/2018 02:06 PM, Adam Borowski wrote: On Thu, Feb 15, 2018 at 12:15:49PM -0500, Ellis H. Wilson III wrote: In discussing the performance of various metadata operations over the past few days I've had this idea in the back of my head, and wanted to see if anybody had already thought about

Re: [RFC PATCH] btrfs: Speedup btrfs_read_block_groups()

2018-02-23 Thread Ellis H. Wilson III
On 02/22/2018 06:37 PM, Qu Wenruo wrote: On 2018-02-23 00:31, Ellis H. Wilson III wrote: On 02/21/2018 11:56 PM, Qu Wenruo wrote: On 2018-02-22 12:52, Qu Wenruo wrote: btrfs_read_block_groups() is used to build up the block group cache for all block groups, so it will iterate all block