On Mon, Feb 8, 2021 at 3:11 PM Goffredo Baroncelli <kreij...@libero.it> wrote:
>
> On 2/7/21 11:06 PM, Chris Murphy wrote:
> > systemd-journald journals on Btrfs default to nodatacow, and upon log
> > rotation each is submitted for defragmenting with BTRFS_IOC_DEFRAG. The
> > result looks curious. I can't tell what the logic is from the results.
> >
> > The journal file starts out being fallocated with a size of 8MB, and
> > as it grows there is an append of 8MB increments, also fallocated.
> > This leads to a filefrag -v that looks like this (ext4 and btrfs
> > nodatacow follow the same behavior, both are provided for reference):
> >
> > ext4
> > https://pastebin.com/6vuufwXt
> >
> > btrfs
> > https://pastebin.com/Y18B2m4h
> >
> > Following defragment with BTRFS_IOC_DEFRAG it looks like this:
> > https://pastebin.com/1ufErVMs
> >
> > It appears at first glance to be significantly more fragmented. Closer
> > inspection shows that most of the extents weren't relocated. But
> > what's up with the peculiar interleaving? Is this an improvement over
> > the original allocation?
>
> I am not sure how to read the filefrag output: I see several lines like
> [...]
>     5:     1691..    1693:     125477..    125479:      3:
>     6:     1694..    1694:     125480..    125480:      1:             unwritten
> [...]
>
> What does "unwritten" mean? The kernel documentation [*] says:


My understanding is that it's an extent that has been fallocated but not
yet written to. What I don't know is whether these extents are possibly
tripping up BTRFS_IOC_DEFRAG. I'm not skilled enough to quickly create a
bunch of these journal files (the alternative is to let a system run and
age its own journals, which takes forever) and then write a small
program that runs the same file through BTRFS_IOC_DEFRAG twice to see
whether it's idempotent. The resulting file after one submission does
not have unwritten extents.
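
A minimal sketch of such a double-submission test, in Python, could look
like the following. It is an illustration under assumptions, not a
verified tool: the ioctl number is the value BTRFS_IOC_DEFRAG has under
the generic Linux ioctl encoding (x86, arm), filefrag is assumed to be in
PATH, the fsync before reading the extent map is my guess at what is
needed to see the final layout, and the helper names (parse_extents,
extent_map, defrag_once, check_idempotent) are made up for this example.

```python
import fcntl
import os
import re
import subprocess

# _IOW(0x94, 2, struct btrfs_ioctl_vol_args) with a 4096-byte struct,
# assuming the generic ioctl number encoding used on x86 and arm.
BTRFS_IOC_DEFRAG = 0x50009402

# One extent row of `filefrag -v`:
#   ext: logical_offset: physical_offset: length: [expected:] flags
EXTENT_RE = re.compile(
    r"^\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+):"
    r"(?:\s*(\d+):)?\s*(.*)$"
)


def parse_extents(filefrag_output):
    """Return (logical_start, physical_start, length, flags) per extent row."""
    extents = []
    for line in filefrag_output.splitlines():
        m = EXTENT_RE.match(line)
        if m:
            flags = tuple(f for f in m.group(8).strip().split(",") if f)
            extents.append(
                (int(m.group(2)), int(m.group(4)), int(m.group(6)), flags)
            )
    return extents


def extent_map(path):
    """Run `filefrag -v` on path and parse its extent rows."""
    result = subprocess.run(
        ["filefrag", "-v", path], check=True, capture_output=True, text=True
    )
    return parse_extents(result.stdout)


def defrag_once(path):
    """Submit the whole file with a NULL argument, then flush it."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.ioctl(fd, BTRFS_IOC_DEFRAG, 0)  # 0 is passed as a NULL argument
        os.fsync(fd)  # assumption: flush so the extent map reflects the result
    finally:
        os.close(fd)


def check_idempotent(path):
    """Defragment twice and report whether the extent map changed."""
    defrag_once(path)
    first = extent_map(path)
    defrag_once(path)
    second = extent_map(path)
    return first == second, first, second
```

Calling check_idempotent() on a copy of a rotated journal would return
True as its first element if the second submission left the layout
alone. Counting rows whose flags contain "unwritten" before and after
would also show whether those extents are what changes.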

Another thing I'm not sure of is whether ssd vs nossd affects the
defrag results, or datacow versus nodatacow.

I'm also not sure whether autodefrag is a better solution to the
problem, an approach whereby it acts as a no-op when the file is
nodatacow and does the expected thing if it's datacow. But then we'd
need an autodefrag xattr to set on the enclosing directory for these
journals, because there's no reliable way to set the autodefrag mount
option globally without knowing all the workloads. It can make some
workloads worse.



> My educated guess is that there is something strange in the sequence:
> - write
> - sync
> - close log
> - move log
> - defrag log
>
> Maybe the defrag starts before all the data reaches the platters?

Perhaps. Here is what I get when I attach strace to journald and then
run journalctl --rotate:

https://pastebin.com/UGihfCG9

>
> For what it's worth, I created a file with the same fragmentation as yours:
>
> $ sudo filefrag -v data.txt
> Filesystem type is: 9123683e
> File size of data.txt is 25165824 (6144 blocks of 4096 bytes)
>   ext:     logical_offset:        physical_offset: length:   expected: flags:
>     0:        0..       0:    1597171..   1597171:      1:
>     1:        1..    1599:  163433285.. 163434883:   1599:    1597172:
>     2:     1600..    1607:    1601255..   1601262:      8:  163434884:
>     3:     1608..    1689:    1604137..   1604218:     82:    1601263:
>     4:     1690..    1690:    1597484..   1597484:      1:    1604219:
>     5:     1691..    1693:    1597465..   1597467:      3:    1597485:
>     6:     1694..    1694:    1597966..   1597966:      1:    1597468:
>     7:     1695..    1722:    1599557..   1599584:     28:    1597967:
>     8:     1723..    1723:    1599211..   1599211:      1:    1599585:
>     9:     1724..    1955:    1648394..   1648625:    232:    1599212:
>    10:     1956..    1956:    1599695..   1599695:      1:    1648626:
>    11:     1957..    2047:    1625881..   1625971:     91:    1599696:
>    12:     2048..    2417:    1648804..   1649173:    370:    1625972:
>    13:     2418..    2420:    1597468..   1597470:      3:    1649174:
>    14:     2421..    2478:    1624667..   1624724:     58:    1597471:
>    15:     2479..    2479:    1596416..   1596416:      1:    1624725:
>    16:     2480..    2482:    1601045..   1601047:      3:    1596417:
>    17:     2483..    2483:    1596854..   1596854:      1:    1601048:
>    18:     2484..    2523:    1602715..   1602754:     40:    1596855:
>    19:     2524..    2527:    1597471..   1597474:      4:    1602755:
>    20:     2528..    2598:    1624725..   1624795:     71:    1597475:
>    21:     2599..    2599:    1596858..   1596858:      1:    1624796:
>    22:     2600..    2607:    1601263..   1601270:      8:    1596859:
>    23:     2608..    2608:    1596863..   1596863:      1:    1601271:
>    24:     2609..    2611:    1601271..   1601273:      3:    1596864:
>    25:     2612..    2612:    1596864..   1596864:      1:    1601274:
>    26:     2613..    2615:    1601274..   1601276:      3:    1596865:
>    27:     2616..    2616:    1596981..   1596981:      1:    1601277:
>    28:     2617..    2691:    1649174..   1649248:     75:    1596982:
>    29:     2692..    2696:    1597475..   1597479:      5:    1649249:
>    30:     2697..    2756:    1634995..   1635054:     60:    1597480:
>    31:     2757..    2758:    1597480..   1597481:      2:    1635055:
>    32:     2759..    2762:    1601351..   1601354:      4:    1597482:
>    33:     2763..    2764:    1597482..   1597483:      2:    1601355:
>    34:     2765..    2837:    1649249..   1649321:     73:    1597484:
>    35:     2838..    2838:    1597038..   1597038:      1:    1649322:
>    36:     2839..    2855:    1601538..   1601554:     17:    1597039:
>    37:     2856..    2856:    1597045..   1597045:      1:    1601555:
>    38:     2857..    2904:    1624547..   1624594:     48:    1597046:
>    39:     2905..    2926:    1600795..   1600816:     22:    1624595:
>    40:     2927..    2942:    1602034..   1602049:     16:    1600817:
>    41:     2943..    2963:    1600817..   1600837:     21:    1602050:
>    42:     2964..    2979:    1602183..   1602198:     16:    1600838:
>    43:     2980..    3001:    1600927..   1600948:     22:    1602199:
>    44:     3002..    3043:    1621164..   1621205:     42:    1600949:
>    45:     3044..    3053:    1599231..   1599240:     10:    1621206:
>    46:     3054..    3066:    1601952..   1601964:     13:    1599241:
>    47:     3067..    3067:    1597056..   1597056:      1:    1601965:
>    48:     3068..    3084:    1602375..   1602391:     17:    1597057:
>    49:     3085..    3094:    1599290..   1599299:     10:    1602392:
>    50:     3095..    3096:    1601355..   1601356:      2:    1599300:
>    51:     3097..    3107:    1600717..   1600727:     11:    1601357:
>    52:     3108..    3156:    1642892..   1642940:     49:    1600728:
>    53:     3157..    3157:    1597059..   1597059:      1:    1642941:
>    54:     3158..    3251:    1649322..   1649415:     94:    1597060:
>    55:     3252..    3254:    1599241..   1599243:      3:    1649416:
>    56:     3255..    3304:    1645466..   1645515:     50:    1599244:
>    57:     3305..    3305:    1597100..   1597100:      1:    1645516:
>    58:     3306..    3312:    1601357..   1601363:      7:    1597101:
>    59:     3313..    3319:    1599300..   1599306:      7:    1601364:
>    60:     3320..    3331:    1601611..   1601622:     12:    1599307:
>    61:     3332..    3339:    1600838..   1600845:      8:    1601623:
>    62:     3340..    3343:    1601419..   1601422:      4:    1600846:
>    63:     3344..    3351:    1600846..   1600853:      8:    1601423:
>    64:     3352..    3432:    1649416..   1649496:     81:    1600854:
>    65:     3433..    3433:    1597109..   1597109:      1:    1649497:
>    66:     3434..    3489:    1649497..   1649552:     56:    1597110:
>    67:     3490..    3491:    1599227..   1599228:      2:    1649553:
>    68:     3492..    3521:    1619348..   1619377:     30:    1599229:
>    69:     3522..    3523:    1599307..   1599308:      2:    1619378:
>    70:     3524..    3530:    1601688..   1601694:      7:    1599309:
>    71:     3531..    3539:    1600949..   1600957:      9:    1601695:
>    72:     3540..    3579:    1629356..   1629395:     40:    1600958:
>    73:     3580..    3580:    1597124..   1597124:      1:    1629396:
>    74:     3581..    3601:    1604219..   1604239:     21:    1597125:
>    75:     3602..    3603:    1599585..   1599586:      2:    1604240:
>    76:     3604..    3614:    1602636..   1602646:     11:    1599587:
>    77:     3615..    3616:    1599587..   1599588:      2:    1602647:
>    78:     3617..    3677:    1649553..   1649613:     61:    1599589:
>    79:     3678..    3680:    1599692..   1599694:      3:    1649614:
>    80:     3681..    3723:    1647818..   1647860:     43:    1599695:
>    81:     3724..    3726:    1599821..   1599823:      3:    1647861:
>    82:     3727..    3756:    1622218..   1622247:     30:    1599824:
>    83:     3757..    3759:    1600630..   1600632:      3:    1622248:
>    84:     3760..    3766:    1603288..   1603294:      7:    1600633:
>    85:     3767..    3768:    1600633..   1600634:      2:    1603295:
>    86:     3769..    3950:   76053306..  76053487:    182:    1600635:
>    87:     3951..    3958:    1600958..   1600965:      8:   76053488:
>    88:     3959..    3986:    1619921..   1619948:     28:    1600966:
>    89:     3987..    3995:    1600966..   1600974:      9:    1619949:
>    90:     3996..    4036:    1649614..   1649654:     41:    1600975:
>    91:     4037..    4045:    1600975..   1600983:      9:    1649655:
>    92:     4046..    4050:    1601423..   1601427:      5:    1600984:
>    93:     4051..    4052:    1600854..   1600855:      2:    1601428:
>    94:     4053..    4055:    1601555..   1601557:      3:    1600856:
>    95:     4056..    4056:    1597129..   1597129:      1:    1601558:
>    96:     4057..    4059:    1601745..   1601747:      3:    1597130:
>    97:     4060..    4060:    1597134..   1597134:      1:    1601748:
>    98:     4061..    4063:    1602050..   1602052:      3:    1597135:
>    99:     4064..    4064:    1597137..   1597137:      1:    1602053:
>   100:     4065..    4079:    1604297..   1604311:     15:    1597138:
>   101:     4080..    4088:    1600987..   1600995:      9:    1604312:
>   102:     4089..    4095:    1603295..   1603301:      7:    1600996:
>   103:     4096..    4106:    1600996..   1601006:     11:    1603302:
>   104:     4107..    4117:    1622600..   1622610:     11:    1601007:
>   105:     4118..    4119:    1601007..   1601008:      2:    1622611:
>   106:     4120..    4129:    1622611..   1622620:     10:    1601009:
>   107:     4130..    4131:    1601009..   1601010:      2:    1622621:
>   108:     4132..    4141:    1622621..   1622630:     10:    1601011:
>   109:     4142..    4145:    1601011..   1601014:      4:    1622631:
>   110:     4146..    4155:    1622986..   1622995:     10:    1601015:
>   111:     4156..    4157:    1601015..   1601016:      2:    1622996:
>   112:     4158..    4168:    1622996..   1623006:     11:    1601017:
>   113:     4169..    4170:    1601017..   1601018:      2:    1623007:
>   114:     4171..    4180:    1623007..   1623016:     10:    1601019:
>   115:     4181..    4182:    1601019..   1601020:      2:    1623017:
>   116:     4183..    4192:    1624473..   1624482:     10:    1601021:
>   117:     4193..    4195:    1601021..   1601023:      3:    1624483:
>   118:     4196..    4205:    1624796..   1624805:     10:    1601024:
>   119:     4206..    4207:    1601024..   1601025:      2:    1624806:
>   120:     4208..    4217:    1624806..   1624815:     10:    1601026:
>   121:     4218..    4220:    1601026..   1601028:      3:    1624816:
>   122:     4221..    4230:    1625972..   1625981:     10:    1601029:
>   123:     4231..    4408:    1648626..   1648803:    178:    1625982:
>   124:     4409..    4411:    1602199..   1602201:      3:    1648804:
>   125:     4412..    4434:    1601328..   1601350:     23:    1602202:
>   126:     4435..    4437:    1602647..   1602649:      3:    1601351:
>   127:     4438..    4439:    1601029..   1601030:      2:    1602650:
>   128:     4440..    4442:    1602755..   1602757:      3:    1601031:
>   129:     4443..    4480:    1601650..   1601687:     38:    1602758:
>   130:     4481..    4491:    1629530..   1629540:     11:    1601688:
>   131:     4492..    4560:    1624404..   1624472:     69:    1629541:
>   132:     4561..    4571:    1629541..   1629551:     11:    1624473:
>   133:     4572..    4582:    1601031..   1601041:     11:    1629552:
>   134:     4583..    4586:    1603302..   1603305:      4:    1601042:
>   135:     4587..    4620:    1602537..   1602570:     34:    1603306:
>   136:     4621..    4631:    1629716..   1629726:     11:    1602571:
>   137:     4632..    4634:    1601042..   1601044:      3:    1629727:
>   138:     4635..    6143:  156004864.. 156006372:   1509:    1601045: last,eof
> data.txt: 139 extents found
>
> Then I tried to defrag it:
>
> $ btrfs fi defra  data.txt
> $ sudo filefrag -v data.txt
> Filesystem type is: 9123683e
> File size of data.txt is 25165824 (6144 blocks of 4096 bytes)
>   ext:     logical_offset:        physical_offset: length:   expected: flags:
>     0:        0..    6143:  164002967.. 164009110:   6144:             last,eof
> data.txt: 1 extent found
>
> So it seems that the defrag works

I get different results from BTRFS_IOC_DEFRAG, which is what
systemd-journald uses, and BTRFS_IOC_DEFRAG_RANGE, which is what 'btrfs
fi defrag' uses with a default len of 32M.
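
To make the difference between the two calls concrete, here is a hedged
Python sketch of both. The struct layout follows struct
btrfs_ioctl_defrag_range_args from the kernel's btrfs UAPI header, and
the ioctl numbers assume the generic encoding used on x86 and arm.
Passing the 32M value as extent_thresh, with start 0 and len covering
the whole file, is my assumption about what 'btrfs fi defrag' does by
default; the helper names are invented for the example.

```python
import fcntl
import struct

BTRFS_IOCTL_MAGIC = 0x94
U64_MAX = 2**64 - 1


def _IOW(magic, nr, size):
    # Generic Linux ioctl encoding (x86, arm): write direction (1) in
    # bits 30-31, argument size in bits 16-29, magic in 8-15, nr in 0-7.
    return (1 << 30) | (size << 16) | (magic << 8) | nr


# struct btrfs_ioctl_defrag_range_args:
#   __u64 start, len, flags; __u32 extent_thresh, compress_type, unused[4]
RANGE_ARGS = struct.Struct("=QQQII4I")

# struct btrfs_ioctl_vol_args is 4096 bytes (8-byte fd + 4088-byte name).
BTRFS_IOC_DEFRAG = _IOW(BTRFS_IOCTL_MAGIC, 2, 4096)
BTRFS_IOC_DEFRAG_RANGE = _IOW(BTRFS_IOCTL_MAGIC, 16, RANGE_ARGS.size)


def range_args(start=0, length=U64_MAX, flags=0,
               extent_thresh=32 * 1024 * 1024):
    """Pack the argument struct; compress_type and unused[] stay zero."""
    return RANGE_ARGS.pack(start, length, flags, extent_thresh, 0, 0, 0, 0, 0)


def defrag_plain(fd):
    """What journald is described as doing: no argument at all."""
    fcntl.ioctl(fd, BTRFS_IOC_DEFRAG, 0)


def defrag_range(fd, **kwargs):
    """The ranged variant, with an explicit extent threshold."""
    fcntl.ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, range_args(**kwargs))
```

The plain call gives the caller no way to pass a threshold, so any
difference in results would have to come from whatever default the
kernel applies in that case.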

Another question about BTRFS_IOC_DEFRAG: is it intended to be
minimalist? Does it have a way to estimate fragmentation and just not
do anything? I ask because the journald nodatacow journals are not
meaningfully fragmented. They look the same on ext4 and on Btrfs: it's
(so far) always 8MB extents, directly related to each fallocate grow
of the journal file. I think this kind of faux-fragmentation is minor
even on an HDD, because it's the same on ext4 and XFS and no one
complains there (as far as I'm aware).
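
As an illustration of what "estimate fragmentation and do nothing"
could mean in userspace, here is a small Python sketch. It is a
hypothetical heuristic, not what the kernel or journald does: it merges
extents that are adjacent both logically and physically, then asks
whether any resulting run is smaller than a threshold (8MB here, chosen
to match the fallocate increment). The function names and the threshold
are assumptions.

```python
def contiguous_runs(extents):
    """Merge extents that are adjacent both logically and physically.

    extents: (logical_start, physical_start, length) tuples in blocks,
    sorted by logical_start. Returns merged runs in the same form.
    """
    runs = []
    for logical, physical, length in extents:
        if runs:
            prev_l, prev_p, prev_len = runs[-1]
            if prev_l + prev_len == logical and prev_p + prev_len == physical:
                runs[-1] = (prev_l, prev_p, prev_len + length)
                continue
        runs.append((logical, physical, length))
    return runs


def worth_defragmenting(extents, threshold_bytes=8 * 1024 * 1024,
                        block_size=4096):
    """Hypothetical check: True if any merged run is below the threshold."""
    runs = contiguous_runs(extents)
    return any(length * block_size < threshold_bytes for _, _, length in runs)
```

With this check, a file made only of 8MB extents would be skipped, while
a layout like the 139-extent data.txt quoted above would be submitted. A
short final extent would also trigger it, so a real check would need to
treat the tail separately.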


-- 
Chris Murphy
