Re: too many lstat() syscalls, therefore too many IOPS
Oops, somehow I sent my e-mail twice... But at least now I know about ionice; I never needed it, but it's even installed by default (it's part of util-linux, together with mount, fdisk, etc.). I'm now browsing through the other utilities in this package, which is full of small jewels. You never stop learning...

Thanks,
Eric
Re: too many lstat() syscalls, therefore too many IOPS
"Eric L. Zolf" wrote: > I personally don't know an I/O-equivalent of "nice". Try "ionice". ;-) -- Yves Bellefeuille
Re: too many lstat() syscalls, therefore too many IOPS
There is a tool called ionice that comes with the util-linux package; it should help other read/write processes if you are multitasking. ionice shouldn't slow down rdiff-backup if no other I/O processes are running.
Re: too many lstat() syscalls, therefore too many IOPS
Hi,

first, I don't see anything surprising in what you describe, so all normal AFAICJ.

Second, rdiff-backup needs to check each source file/directory and each target, compare them, and then copy (or not), so if you have some 2300 files to back up, that would sound about right. If the target or the source file doesn't exist, it would give an error.

If the files are small or don't have changes, the lstat calls happen a lot and not much else; this is typical random access. It gives a much different access pattern than the copying of bigger files, where more sequential access is typically done to read/write the file's data.

There is no real way to improve the situation: rdiff-backup goes as fast as it can, and I personally don't know an I/O equivalent of "nice" (and if you limit the I/O, the backup will be even slower).

You could try the --no-fsync option to improve speed:

    --fsync, --no-fsync  [opt] do (or not) often sync the file system
                         (_not_ doing it is faster but can be dangerous)

And, yes, the `rdiff-backup-data/increments` directory is used by rdiff-backup to keep track of file and directory changes.

Hope this helps,
Eric
Re: Re[4]: too many lstat() syscalls, therefore too many IOPS
We know that check-destination-dir is especially slow, slower than the backup itself. IIRC there is even an issue open about the regress speed. It just requires time to look into it, and I don't even know if an improvement is possible.

The "no such file or directory" thingy doesn't sound normal, probably worth a bug report...

Eric
Re[3]: too many lstat() syscalls, therefore too many IOPS
It seems it does a lot of lstat() calls when run with the `--check-destination-dir` option, which is the fallback in case a backup can't be finished. Hm.

---
Best Regards,
Andrei Enshin
Re[4]: too many lstat() syscalls, therefore too many IOPS
Okay, it seems I can see the reason for this behavior. Sorry for bothering you with such questions.

We run the backup every 4 hours, and it seems there is a 7200-second timeout. This means rdiff-backup gets killed, and then we run it again with the `--check-destination-dir` option, which causes very intensive disk usage by doing a lot of lstat() calls.

That is my current understanding.

It is still unclear to me why `--check-destination-dir` does so many lstat() calls and why it fails when checking some directory:

    Exception '[Errno 2] No such file or directory: '/some/path/rdiff-backup-data/increments/foo/aa.2021-05-12T04:15:01Z.dir'' raised of class '':

---
Best Regards,
Andrei Enshin
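The kill-and-repair cycle described in this message could be sketched roughly as below. This is only an illustration under assumptions: the wrapper name `backup_with_timeout`, the `RDIFF_BACKUP` override variable, and the decision to run the check right after the timeout are all hypothetical and not taken from the poster's actual setup. The only facts relied on are that GNU `timeout` exits with status 124 when it kills the command, and that `--check-destination-dir` is the option named in the thread.

```shell
# Hypothetical wrapper: run a backup under a time limit and, if the run
# was cut off, check/repair the destination directory afterwards.
# The 7200-second default mirrors the timeout mentioned in the message.

backup_with_timeout() {
    src=$1
    dest=$2
    limit=${3:-7200}
    # RDIFF_BACKUP is a made-up override so the binary can be swapped out.
    rb=${RDIFF_BACKUP:-rdiff-backup}

    # GNU timeout exits with status 124 when it had to kill the command.
    timeout "$limit" "$rb" "$src" "$dest"
    status=$?

    if [ "$status" -eq 124 ]; then
        echo "backup killed after ${limit}s; checking destination" >&2
        # This is the lstat()-heavy step discussed in this thread.
        "$rb" --check-destination-dir "$dest"
    fi
    return "$status"
}

# Hypothetical usage (placeholder paths):
#   backup_with_timeout /path/to/source /path/to/backup 7200
```

If every run hits the time limit, every run also pays for the check, which would match the repeated bursts of lstat() calls reported here; raising the limit or letting one backup finish may be worth trying before anything else.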
Re[2]: too many lstat() syscalls, therefore too many IOPS
Hi,

Thank you for the explanation.

During the backup, rdiff-backup did an lstat for
/some/path/rdiff-backup-data/increments/foo/bar
which returned ENOENT.

Does it mean it tried to check some file in increments which is not there? If it is not in increments, does it mean it was never backed up? If all of the above is true, why is there still no such file after the backup is done? Is that expected?

I've just played a bit with rdiff-backup on my local machine.

    # at /tmp/tmp.jondxmEQDC
    $ cat > a
    aaa
    ^D

    # at /tmp/tmp.zh49h057dq
    $ mkdir bckp
    $ rdiff-backup /tmp/tmp.jondxmEQDC bckp/
    $ ls bckp/rdiff-backup-data/increments/
    # empty

This means that after the very first backup there is nothing in increments. Let's add a new file and back up once again:

    # at /tmp/tmp.jondxmEQDC
    $ cat > b
    bbb
    ^D

    # at /tmp/tmp.zh49h057dq
    $ rdiff-backup /tmp/tmp.jondxmEQDC bckp/
    $ ls bckp/rdiff-backup-data/increments/
    b.2021-05-12T22:11:02+09:00.missing

I can see a record for the new file with a .missing suffix.

However, the `lstat()` call tries to access something which has no such suffix. What is it trying to access?

---
Best Regards,
Andrei Enshin
Re: too many lstat() syscalls, therefore too many IOPS
On Wed, 2021-05-12 at 13:01 +0200, Eric L. Zolf wrote:
...
> There is no real way to improve the situation, rdiff-backup goes as fast
> as it can and I personally don't know an I/O-equivalent of "nice" (and
> if you limit the I/O, the backup will be even slower).

The I/O equivalent of "nice" is "ionice", and you probably want to set it to the "idle" class, i.e.

    ionice -c idle rdiff-backup ...

This will run when other activities are not doing I/O, and yes, it will run slower on a system that is doing other I/O tasks.

Regards
Frank
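A minimal sketch of the suggestion above, wrapped in a small helper. The function name `run_idle` and the example paths are made up for illustration; `-c 3` is the numeric form of the idle class, and `nice -n 19` additionally lowers CPU priority, which is an extra not mentioned in the thread. As far as I know, the I/O class only has an effect with I/O schedulers that honour priorities (for example CFQ or BFQ), so treat this as something to test rather than a guaranteed fix.

```shell
# Run any command in the idle I/O class and at the lowest CPU priority.
# ionice classes: 1 = realtime, 2 = best-effort, 3 = idle.
run_idle() {
    ionice -c 3 nice -n 19 "$@"
}

# Hypothetical usage (placeholder paths):
#   run_idle rdiff-backup /path/to/source /path/to/backup
```

For a backup that is already running, `ionice -c 3 -p PID` should move the existing process into the idle class instead of restarting it.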
too many lstat() syscalls, therefore too many IOPS
Hi rdiff-backup folks,

Recently, during backups, I can see a spike in IOPS of up to 500, which exhausts the limit of the VM. As a result, the backup process takes very long. I've straced a bit, and what I can see is many failed lstat() syscalls:

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     42.71    0.040247           9      4608      1420 lstat
     35.41    0.033370          12      2860           getdents
      9.41    0.008865           6      1431           open
      4.63    0.004363           3      1430           close
      4.03    0.003797           3      1431           fstat
      3.75    0.003536           2      1417           getuid
      0.04    0.000039          39         1           unlink
      0.01    0.000013           1         9           read
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.094230                 13187      1420 total

It seems rdiff-backup checks for the existence of some file/dir:

    10:13:16 lstat("/some/path/rdiff-backup-data/increments/foo/bar", 0x7ffd832fa810) = -1 ENOENT (No such file or directory) <0.20>

After the backup is done, there is still no such file. It seems the /rdiff-backup-data/increments/ part of the path is some "config" for rdiff-backup, and it probably tries to find something there but can't?

What might be wrong in my setup? What would you recommend checking to solve the issue, if it is an issue at all?

---
Best Regards,
Andrei Enshin
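To put the "many failed lstat()" observation into numbers, the share of failed calls can be computed from the summary table in this message, which looks like the output of `strace -c`. The sketch below is only a worked example: the three rows are copied from the table, and the helper name `failed_share` is made up.

```shell
# Rows copied from the summary table in the message:
# %time seconds usecs/call calls [errors] syscall
summary='42.71 0.040247 9 4608 1420 lstat
35.41 0.033370 12 2860 getdents
9.41 0.008865 6 1431 open'

failed_share() {
    # Only rows that carry an "errors" value have six fields.
    awk 'NF == 6 { printf "%s: %d of %d calls failed (%.1f%%)\n", $6, $5, $4, 100 * $5 / $4 }'
}

printf '%s\n' "$summary" | failed_share
# prints: lstat: 1420 of 4608 calls failed (30.8%)
```

So roughly three in ten lstat() calls return an error in this trace. The per-call timing at the end of the lstat() line (the value in angle brackets) is what `strace -T` adds, which can help tell slow lookups from merely numerous ones.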