On Saturday, October 15, 2016 at 9:29:25 AM UTC-5, Mike Beaubien wrote:
>
> File "/usr/lib/s3ql/s3ql/backends/local.py", line 245, in _read_meta
> buf = fh.read(9)
> OSError: [Errno 70] Communication error on send
>
> It's probably just some temporary error for whatever network reason. Is
> there anyway to get s3ql to ignore and retry on these errors?
>
The recent S3 outage seemed to trigger this error much more frequently when
using s3ql with acd_cli and overlayfs. Out of curiosity and for testing
purposes, I added a retry loop to local.py around the buf = fh.read(9) call:
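Line 20:
    import errno
    import time
Line 242:
    def _read_meta(fh):
        while True:
            try:
                buf = fh.read(9)
            except OSError as e:
                # Retry only on the two errors seen with acd_cli
                if e.errno in (errno.ECOMM, errno.EFAULT):
                    log.info('OSError: %s, retrying', e)
                    time.sleep(1)
                    continue
                # Anything else is not known to be transient
                raise
            break
        # ... rest of _read_meta unchanged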
ECOMM (errno 70, "Communication error on send", more frequent) and EFAULT
(errno 14, "Bad address", rare) are the only errors I've seen so far using
s3ql and acd_cli together. The 1s sleep interval was meant to prevent
hammering ACD, but in practice it hasn't come into play: the logs show the
retries occurring at wider intervals. It would be better to reuse the retry
code that already exists in s3ql, which uses an exponentially increasing
interval.
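
As a rough illustration of that idea, here is a minimal, self-contained
sketch of a read with an exponentially increasing retry delay. It does not
use s3ql's own retry code; read_with_backoff, TRANSIENT_ERRNOS, and the delay
and retry-count defaults are hypothetical names and values chosen for the
example. It also assumes that a failed read leaves the file position
unchanged, which I have not verified for acd_cli.

```python
import errno
import logging
import time

log = logging.getLogger(__name__)

# Errors treated as transient (assumption: only the two seen so far).
# ECOMM is not defined on every platform, so look it up defensively.
TRANSIENT_ERRNOS = frozenset(
    code
    for code in (getattr(errno, 'ECOMM', None), errno.EFAULT)
    if code is not None
)


def read_with_backoff(fh, size, initial_delay=0.1, max_delay=30.0,
                      max_tries=10, sleep=time.sleep):
    """Read `size` bytes from `fh`, retrying transient OSErrors.

    The delay doubles after each failed attempt, capped at `max_delay`.
    The last error is re-raised once `max_tries` attempts have failed.
    """
    delay = initial_delay
    for attempt in range(1, max_tries + 1):
        try:
            return fh.read(size)
        except OSError as exc:
            # Give up on unknown errors or when out of attempts
            if exc.errno not in TRANSIENT_ERRNOS or attempt == max_tries:
                raise
            log.info('OSError: %s, retrying in %.1fs', exc, delay)
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

In _read_meta this would stand in for the plain read, e.g.
buf = read_with_backoff(fh, 9). Capping the number of attempts means a
persistent failure still surfaces as an error instead of looping forever.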
Very crude, but so far it's been working well: the filesystem has yet to
crash after a few days of testing. Lately, while streaming video, the
retries occur at varying intervals, typically 20-45 min apart and rarely
10-30 s apart for a couple of minutes. Playback is fine with infrequent
retries; when retries repeat within a few minutes (rare), it buffers briefly
and then continues normally.
A real ACD backend would be ideal but for now this has prevented crashes
and having to wait a couple of hours for fsck to complete. On a side note,
there is one advantage to this combination of s3ql, acd_cli, and overlayfs
with Plex - media files can be stored locally first via the upper layer of
overlayfs and given time for Plex to perform deep analysis of the files.
The files can be uploaded at will once the analysis is complete - the
analysis would take much longer and chew up bandwidth if the files were
immediately stored on ACD.
For those willing to experiment, hope this helps!
-Nikhil