Got a crash again. After upgrading to 2.18 it was stable for quite a while this time. I had three simultaneous rsyncs running, backing up in excess of 5 TB of data across two buckets, and it got through around 400 GB or so before it crashed. I've pasted the relevant portion of mount.log (the Python exception at the time of the crash) below. I unmounted, ran an fsck, remounted, and everything is running smoothly again for the time being. Have a look at the crash log and let me know what you think.
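For anyone else hitting this: before unmounting, the crash can be confirmed from the shell, since any access to a dead FUSE mountpoint fails with "Transport endpoint is not connected". A minimal sketch (the mountpoint path /mnt/s3ql is a placeholder for your setup):

```shell
# Probe the mountpoint: stat fails with "Transport endpoint is not
# connected" (ENOTCONN) once the mount.s3ql process has died.
# /mnt/s3ql is a placeholder mountpoint.
mnt=/mnt/s3ql
if stat "$mnt" >/dev/null 2>&1; then
    echo "$mnt: responding"
else
    echo "$mnt: not responding - likely crashed, check mount.log"
fi
```

If it is not responding, fusermount -u releases the mountpoint before running fsck, as the FAQ quoted further down describes.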
Thanks,
Brandon

-- SNIP SNIP --

2016-06-19 10:33:15.849 15572:Thread-5 s3ql.backends.common.wrapped: Encountered ConnectionClosed (found closed when trying to write), retrying ObjectW.close (attempt 3)...
2016-06-19 13:24:37.420 15572:Thread-8 s3ql.backends.common.wrapped: Encountered ConnectionClosed (found closed when trying to write), retrying ObjectW.close (attempt 3)...
2016-06-19 14:29:32.596 15572:Thread-10 s3ql.backends.common.wrapped: Encountered ConnectionClosed (found closed when trying to write), retrying ObjectW.close (attempt 3)...
2016-06-19 16:00:04.635 15572:Thread-12 root.excepthook: Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/mount.py", line 64, in run_with_except_hook
    run_old(*args, **kw)
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/block_cache.py", line 405, in _upload_loop
    self._do_upload(*tmp)
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/block_cache.py", line 432, in _do_upload
    % obj_id).get_obj_size()
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/common.py", line 108, in wrapped
    return method(*a, **kw)
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/common.py", line 340, in perform_write
    return fn(fh)
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/comprenc.py", line 346, in __exit__
    self.close()
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/comprenc.py", line 340, in close
    self.fh.close()
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/comprenc.py", line 505, in close
    self.fh.close()
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/common.py", line 108, in wrapped
    return method(*a, **kw)
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/s3c.py", line 906, in close
    headers=self.headers, body=self.fh)
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/s3c.py", line 460, in _do_request
    query_string=query_string, body=body)
  File "/usr/local/lib/python3.4/dist-packages/s3ql-2.18-py3.4-linux-x86_64.egg/s3ql/backends/s3c.py", line 695, in _send_request
    headers=headers, body=BodyFollowing(body_len))
  File "/usr/local/lib/python3.4/dist-packages/dugong/__init__.py", line 508, in send_request
    self.timeout)
  File "/usr/local/lib/python3.4/dist-packages/dugong/__init__.py", line 1396, in eval_coroutine
    if not next(crt).poll(timeout=timeout):
  File "/usr/local/lib/python3.4/dist-packages/dugong/__init__.py", line 603, in co_send_request
    yield from self._co_send(buf)
  File "/usr/local/lib/python3.4/dist-packages/dugong/__init__.py", line 619, in _co_send
    len_ = self._sock.send(buf)
  File "/usr/lib/python3.4/ssl.py", line 678, in send
    v = self._sslobj.write(data)
ssl.SSLError: [SSL: BAD_LENGTH] bad length (_ssl.c:1638)
2016-06-19 16:08:22.636 15572:MainThread s3ql.mount.unmount: Unmounting file system...
-- snip snip --

On Friday, June 17, 2016 at 4:10:55 PM UTC-5, Nikolaus Rath wrote:
> On Jun 17 2016, Brandon Orwell <[email protected]> wrote:
> > I've been using S3QL for a few days now, and whenever I am copying over
> > large amounts of data the mount point seems to 'lock up' for a period of
> > time (as well as anything else trying to access it), and then I start
> > getting 'transport endpoint not connected' errors. I "umount" the mount
> > point, run fsck on it, and then continue archiving until the problem
> > happens again.
>
> I've heard this kind of story before, but it still amazes me. What train
> of thought led you to this procedure, instead of reporting the problem?
>
> > Does anyone know what would cause these problems?
>
> Where did you look for the answer?
>
> https://bitbucket.org/nikratio/s3ql/wiki/FAQ#!what-does-the-transport-endpoint-not-connected-error-mean
>
> says:
>
> ,----
> | What does the "Transport endpoint not connected" error mean?
> |
> | It means that the file system has crashed. Please check mount.log for a
> | more useful error message and report a bug if appropriate. If you can't
> | find any errors in mount.log, the mount process may have "segfaulted".
> | To confirm this, look for a corresponding message in the dmesg output.
> | If the mount process segfaulted, please try to obtain a C backtrace
> | (see Providing Debugging Info) of the crash and file a bug report.
> |
> | To make the mountpoint available again (i.e., unmount the crashed file
> | system), use the fusermount -u command.
> |
> | Before reporting a bug, please make sure that you're not just using the
> | most recent S3QL version, but also the most recent version of the most
> | important dependencies (python-llfuse, python-apsw, python-lzma).
> `----
>
> Best,
> -Nikolaus
>
> --
> GPG encrypted emails preferred.
> Key id: 0xD113FCAC3C4E599F
> Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
> »Time flies like an arrow, fruit flies like a Banana.«

--
You received this message because you are subscribed to the Google Groups "s3ql" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
