I have two large (multi-TB) filesystems running on S3QL. One of them 
contains a number of large files (several hundred GB). Unfortunately the 
filesystem has crashed a number of times over the last few months. Recently 
these crashes have become more and more frequent, now several times a day. I 
think I've finally identified the cause, although I don't have a good 
resolution.

It seems that if the filesystem is deleting a file and a block supposedly 
belonging to that file does not exist in the backend (the DELETE request 
comes back as 404 Not Found), the filesystem crashes in a heap. Normally 
I'd say that the correct resolution would be to 
run the filesystem checker in full verification mode, and maybe that's 
where I'll end up heading if I can't get a better fix. However, in this 
particular use case I don't think the filesystem should crash. It could 
write a warning to the log file, but because a file is being deleted I 
don't believe it should really matter (too much) if the underlying data 
blocks are already missing.
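
A minimal sketch of the "warn instead of crash" behaviour described above. This is not S3QL code: NoSuchObject, DictBackend and delete_blocks_tolerantly are hypothetical stand-ins that only borrow names from the traceback further down, and the real removal loop in block_cache.py may be structured quite differently.

```python
import logging

log = logging.getLogger("removal_sketch")


class NoSuchObject(Exception):
    """Hypothetical stand-in for s3ql.backends.common.NoSuchObject."""

    def __init__(self, key):
        super().__init__("Backend does not have anything stored under key %r" % key)
        self.key = key


class DictBackend:
    """Toy in-memory backend used only for this sketch (not S3QL's API)."""

    def __init__(self, objects):
        self.objects = dict(objects)

    def delete(self, key):
        # Mirror the failure mode from the traceback: deleting a key that
        # is not stored raises NoSuchObject.
        if key not in self.objects:
            raise NoSuchObject(key)
        del self.objects[key]


def delete_blocks_tolerantly(backend, ids):
    """Delete blocks by id, logging a warning for any that are already gone.

    Returns the list of keys that were missing, so the caller can still
    decide to schedule a full check later.
    """
    missing = []
    for i in ids:
        key = 's3ql_data_%d' % i
        try:
            backend.delete(key)
        except NoSuchObject:
            log.warning('Block %s already missing from backend, skipping', key)
            missing.append(key)
    return missing


backend = DictBackend({'s3ql_data_1': b'a', 's3ql_data_2': b'b'})
missing = delete_blocks_tolerantly(backend, [1, 1879267, 2])
print(missing)
```

The traceback shows delete_multi() and delete() being called with a force argument, which looks like the natural hook for this. Whether passing force=True from the removal loop suppresses the error in this version is an assumption that would need checking against the S3QL source.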

Here's an example where the filesystem lasted less than 2 minutes from 
mount to crash:

2017-01-05 04:00:41.650 31164:MainThread s3ql.mount.main: Autodetected 10160 file descriptors available for cache entries
2017-01-05 04:00:44.291 31164:MainThread s3ql.mount.get_metadata: Using cached metadata.
2017-01-05 04:00:44.589 31164:MainThread s3ql.mount.main: Mounting swiftks://auth.cloud.ovh.net/GRA1:Cloud/ at /var/autofs/misc/s3qlcloud...
2017-01-05 04:00:44.676 31269:MainThread s3ql.daemonize.detach_process_context: Daemonizing, new PID is 31271
2017-01-05 04:01:34.100 31271:Thread-28 root.excepthook: Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 394, in delete
    resp = self._do_request('DELETE', '/%s%s' % (self.prefix, key))
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 258, in _do_request
    raise HTTPError(resp.status, resp.reason, resp.headers)
s3ql.backends.s3c.HTTPError: 404 Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
    run_old(*args, **kw)
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/s3ql/s3ql/block_cache.py", line 692, in _removal_loop
    backend.delete_multi(['s3ql_data_%d' % i for i in ids])
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 295, in delete_multi
    return self.backend.delete_multi(keys, force=force)
  File "/usr/lib/s3ql/s3ql/backends/common.py", line 476, in delete_multi
    self.delete(key, force=force)
  File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
    return method(*a, **kw)
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 400, in delete
    raise NoSuchObject(key)
s3ql.backends.common.NoSuchObject: Backend does not have anything stored under key 's3ql_data_1879267'
2017-01-05 04:01:36.792 31271:Thread-26 s3ql.mount.exchook: Unhandled top-level exception during shutdown (will not be re-raised)
2017-01-05 04:01:36.794 31271:Thread-26 root.excepthook: Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 394, in delete
    resp = self._do_request('DELETE', '/%s%s' % (self.prefix, key))
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 258, in _do_request
    raise HTTPError(resp.status, resp.reason, resp.headers)
s3ql.backends.s3c.HTTPError: 404 Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/s3ql/s3ql/mount.py", line 64, in run_with_except_hook
    run_old(*args, **kw)
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/s3ql/s3ql/block_cache.py", line 692, in _removal_loop
    backend.delete_multi(['s3ql_data_%d' % i for i in ids])
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 295, in delete_multi
    return self.backend.delete_multi(keys, force=force)
  File "/usr/lib/s3ql/s3ql/backends/common.py", line 476, in delete_multi
    self.delete(key, force=force)
  File "/usr/lib/s3ql/s3ql/backends/common.py", line 108, in wrapped
    return method(*a, **kw)
  File "/usr/lib/s3ql/s3ql/backends/swift.py", line 400, in delete
    raise NoSuchObject(key)
s3ql.backends.common.NoSuchObject: Backend does not have anything stored under key 's3ql_data_1721463'
2017-01-05 04:02:51.372 31271:MainThread s3ql.mount.unmount: Unmounting file system...
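
To get a feel for how many distinct blocks are affected, the missing keys can be pulled out of the log text. The following is only a sketch: missing_keys is a hypothetical helper, the sample string is copied from the excerpt above, and the regular expression assumes the NoSuchObject message keeps the wording shown there.

```python
import re

# Matches the key in "NoSuchObject: ... under key 's3ql_data_<n>'". DOTALL
# and \s+ let the message span a line break, as it does when the log is
# wrapped by a mail client.
MISSING_KEY_RE = re.compile(
    r"NoSuchObject:.*?under\s+key\s+'(s3ql_data_\d+)'", re.DOTALL)


def missing_keys(log_text):
    """Return missing object keys in order of first appearance, de-duplicated."""
    seen = []
    for key in MISSING_KEY_RE.findall(log_text):
        if key not in seen:
            seen.append(key)
    return seen


sample = """\
s3ql.backends.common.NoSuchObject: Backend does not have anything stored
under key 's3ql_data_1879267'
s3ql.backends.common.NoSuchObject: Backend does not have anything stored
under key 's3ql_data_1721463'
s3ql.backends.common.NoSuchObject: Backend does not have anything stored
under key 's3ql_data_1879267'
"""
print(missing_keys(sample))
```

In practice log_text would be the contents of the mount log read from wherever it is kept on the system in question.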



Thanks,
Chris
