Hi!

Sorry if this is the wrong address to ask about "user problems".

I am currently investigating various filesystems for use on drive-managed SMR
drives (e.g. the seagate 8TB disks). These drives have characteristics not
unlike flash (they want to be written in large batches), but are, of course,
still quite different.

I initially tried btrfs, ext4 and xfs which, not surprisingly, failed rather
miserably after a few hundred GB, dropping to ~30 MB/s (or ~20 MB/s in the
case of btrfs).

I also tried nilfs, which should be an almost perfect match for this
technology, but it performed even worse (I have no clue why, maybe nilfs
skips sectors when writing, which would explain it).

As a last resort, I tried f2fs, which initially performed absolutely great
(average write speed ~130 MB/s over multiple terabytes).

However, I am running into a number of problems, and wonder if f2fs can
somehow be configured to work right.

First of all, I did most of my tests on linux-3.18.14, and recently
switched to 4.1.4. The filesystems were formatted with "-s7", the idea
being that writes occur in 256 MB blocks as much as possible and, most
importantly, are freed in 256 MB blocks, to keep fragmentation low.
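For reference, the format step looked like this (/dev/sdX1 is a placeholder
for the SMR drive partition). One caveat I'm not sure about: if I read the
mkfs.f2fs man page correctly, -s takes the number of 2 MB segments per
section, so "-s7" would give 14 MB sections, and 256 MB sections would
actually need "-s128":

```shell
# Format with 7 segments per section, as used in the tests described
# above. /dev/sdX1 is a placeholder for the real partition.
mkfs.f2fs -s7 /dev/sdX1
```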

Mount options included noatime or 
noatime,inline_xattr,inline_data,inline_dentry,flush_merge,extent_cache
(I suspect 4.1.4 doesn't implement flush_merge yet?).
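Concretely, the mounts looked something like this (device and mount point
are placeholders):

```shell
# Minimal run:
mount -o noatime /dev/sdX1 /mnt/smr

# Full option set used in the other runs:
mount -o noatime,inline_xattr,inline_data,inline_dentry,flush_merge,extent_cache \
      /dev/sdX1 /mnt/smr
```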

My first problem concerns ENOSPC handling - on a 100% utilized filesystem,
cp and rsync happily continued to write without receiving any error, yet no
write activity occurred and the files never ended up on the filesystem. Is
this a known bug?

My second, much bigger problem concerns defragmentation. For testing,
I created a 128 GB partition and kept writing an assortment of files,
from 200 kB to multiple megabytes, to it. To stress test it, I kept
deleting random files to create holes. After a while (around 84%
utilisation), write performance dropped to less than 1 MB/s, and has
stayed at that level ever since for this filesystem.
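The fill-and-delete workload can be sketched roughly as below; TARGET, the
file count and the sizes are placeholders (the real run used a 128 GB f2fs
partition, much larger files, and random rather than deterministic
deletion):

```shell
#!/bin/sh
# Fill a test directory with mixed-size files, then delete a subset of
# them to punch holes into the free space. All names/sizes are examples.
TARGET=${TARGET:-$(mktemp -d)}   # point at the f2fs mount for a real run
NFILES=${NFILES:-100}
MAXKB=${MAXKB:-64}               # real test: 200 kB .. several MB

i=0
while [ "$i" -lt "$NFILES" ]; do
    kb=$(( i * 37 % MAXKB + 1 ))                # vary the file size
    dd if=/dev/zero of="$TARGET/file.$i" bs=1024 count="$kb" 2>/dev/null
    i=$(( i + 1 ))
done

i=0
while [ "$i" -lt "$NFILES" ]; do                # delete every third file
    if [ $(( i % 3 )) -eq 0 ]; then rm -f "$TARGET/file.$i"; fi
    i=$(( i + 1 ))
done
echo "$TARGET now holds $(ls "$TARGET" | wc -l) files"
```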

I kept the filesystem idle for a night in the hope that it would
defragment, but nothing happened. Suspecting in-place updates to be the
culprit, I tried various configurations in the hope of disabling them
(such as setting ipu_policy to 4 or 8, and/or setting min_ipu_util to 0
or 100), but none of that seems to have any effect whatsoever.
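For reference, the knobs I poked live under /sys/fs/f2fs/<device> (dm-9 is
this volume); note that the exact ipu_policy semantics seem to depend on
the kernel version - at some point it changed from an enum to a bitmask of
policies, where 0 should mean no in-place updates at all:

```shell
# Attempted IPU tuning; dm-9 is the device-mapper name of the test volume.
cd /sys/fs/f2fs/dm-9
echo 0 > ipu_policy      # no IPU policy bits set (on bitmask kernels)
echo 0 > min_ipu_util
cat ipu_policy min_ipu_util
```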

From the description of f2fs, it seems to be quite close to ideal for these
drives, as it should be possible to write mostly linearly and keep
fragmentation low by freeing big sequential sections of data.

Pity that it comes so close and then fails so miserably after performing so
admirably initially - can anything be done about this by way of
configuration, or is my understanding of how f2fs writes and garbage
collects flawed?

Here is the output of /sys/kernel/debug/f2fs/status for the filesystem in
question. This was after keeping it idle for a night, then unmounting and
remounting the volume. Before the unmount, it had very high values in
the GC calls section, but no reads were observed during the night,
just writes (as seen with dstat -Dsdx).

   =====[ partition info(dm-9). #1 ]=====
   [SB: 1] [CP: 2] [SIT: 6] [NAT: 114] [SSA: 130] [MAIN: 65275(OverProv:2094 Resv:1456)]

   Utilization: 84% (27320244 valid blocks)
     - Node: 31936 (Inode: 5027, Other: 26909)
     - Data: 27288308
     - Inline_data Inode: 0
     - Inline_dentry Inode: 0

   Main area: 65275 segs, 9325 secs 9325 zones
     - COLD  data: 12063, 1723, 1723
     - WARM  data: 12075, 1725, 1725
     - HOT   data: 65249, 9321, 9321
     - Dir   dnode: 65269, 9324, 9324
     - File   dnode: 24455, 3493, 3493
     - Indir nodes: 65260, 9322, 9322

     - Valid: 52278
     - Dirty: 9
     - Prefree: 0
     - Free: 12988 (126)

   CP calls: 10843
   GC calls: 91 (BG: 11)
     - data segments : 21 (0)
     - node segments : 70 (0)
   Try to move 30355 blocks (BG: 0)
     - data blocks : 7360 (0)
     - node blocks : 22995 (0)

   Extent Hit Ratio: 8267 / 24892

   Extent Tree Count: 3130

   Extent Node Count: 3138

   Balancing F2FS Async:
     - inmem:    0, wb:    0
     - nodes:    0 in 5672
     - dents:    0 in dirs:   0
     - meta:    0 in 3567
     - NATs:         0/     9757
     - SITs:         0/    65275
     - free_nids:       868

   Distribution of User Blocks: [ valid | invalid | free ]
     [------------------------------------------||--------]

   IPU: 0 blocks
   SSR: 0 blocks in 0 segments
   LFS: 49114 blocks in 95 segments

   BDF: 64, avg. vblocks: 1254

   Memory: 48948 KB
     - static: 11373 KB
     - cached: 619 KB
     - paged : 36956 KB

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schm...@schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\

------------------------------------------------------------------------------
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
