On Dec 27 2020, Ivan Shapovalov <[email protected]> wrote: > Full mount.s3ql invocation is: > > -- 8< -- > /usr/bin/mount.s3ql b2://<mybucket> /mnt/b2/files -o > systemd,log=none,authfile=/etc/s3ql/authinfo2,cachedir=/var/tmp/s3ql,debug,allow-other,compress=none,cachesize=10485760,threads=8,keep-cache,backend-options=disable-versions > -- 8< --
Please also run this under gdb: $ gdb python3 > run /usr/bin/mount.s3ql b2://<mybucket> /mnt/b2/files -o > systemd,log=none,authfile=/etc/s3ql/authinfo2,cachedir=/var/tmp/s3ql,debug,allow-other,compress=none,cachesize=10485760,threads=8,keep-cache,backend-options=disable-versions When the process terminates, run: > thread apply all bt > > Full s3ql log (captured stdout/stderr) is available here: > https://intelfx.name/files/persistent/2020-12-27%20s3ql/s3ql.2.log.zst > https://intelfx.name/files/persistent/2020-12-27%20s3ql/s3ql.3.log.zst > (caution -- each log is ~150 MB unpacked, which is a separate concern > of mine but irrelevant to this problem.) > > Manual excerpt from s3ql log relevant to the block that failed: > -- 8< -- > mount.s3ql[117923]: 2020-12-27 13:02:03.135 117923 DEBUG MainThread > s3ql.block_cache.upload_if_dirty: started with <Dirty CacheEntry, > inode=11545, blockno=109> Debug logging won't help with this problem, you can safely disable that. > Strangely, systemd tells me that mount.s3ql does not just exit > normally, but is terminated by SIGTERM: > > -- 8< -- > $ systemctl status [email protected] > ● [email protected] - s3ql file system at /mnt/b2/files > Loaded: loaded (/etc/systemd/system/[email protected]; enabled; vendor > preset: disabled) > Active: inactive (dead) since Sun 2020-12-27 13:02:05 MSK; 1h 27min ago > Process: 117923 ExecStart=/usr/bin/mount.s3ql ${What} /mnt/b2/files -o > systemd,log=none,authfile=/etc/s3ql/authinfo2,cachedir=/var/tmp/s3ql,${Options} > (code=killed, signal=TERM) > Process: 135718 ExecStop=/usr/bin/umount /mnt/b2/files (code=exited, > status=32) > Main PID: 117923 (code=killed, signal=TERM) > IP: 40.5M in, 8.5G out > CPU: 6min 11.868s > -- 8< -- What does your kernel log say at this time (dmesg)? Could it be that you're running out of memory, and the OOM killer is killing mount.s3ql to free up memory? The TERM signal does not make sense to me, to this a non-fatal signal that should result in S3QL gracefully exiting. Could you try what happens when you manually send SIGTERM to a running mount.s3ql process? Does it terminate properly with full logging until the end? > mount.s3ql[117923]: NoneType: None > > I'd gladly go debug it myself, but I don't know where to start with > this async stuff. Any pointers? How do I read this log? What is this > strange "NoneType: None"? This doesn't make any sense to me either. Normally I would suggest to add e.g. line numbers and file names to https://github.com/s3ql/s3ql/blob/master/src/s3ql/logging.py#L86 to see where this message is generated, but looking at it it seems that it should already inculde the function name.. So whatever logs this is bypassing the regular logging code path. So, in summary: - Run standalone under gdb (and not as a systemd service) - Check kernel logs - Check memory usage - Try to send SIGTERM to a non-problematic mount Best, -Nikolaus -- GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.« -- You received this message because you are subscribed to the Google Groups "s3ql" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/s3ql/87blefwk0s.fsf%40vostro.rath.org.
