Hi,
I'm using WAL-E 0.8.1 on AWS (EC2 PostgreSQL servers; S3 bucket). Each time
I test a backup-fetch, I get a handful of timeout messages for individual
volumes, like this:
wal_e.worker.s3.s3_worker INFO MSG: beginning partition download
DETAIL: The partition being downloaded is part_00000057.tar.lzo.
HINT: The absolute S3 key is basebackups_005/base_0000000200005B950000008F_00000032/tar_partitions/part_00000057.tar.lzo.
STRUCTURED: time=2016-01-05T02:40:21.666194-00 pid=32468
lzop: Invalid argument: <stdin>
wal_e.retries WARNING MSG: retrying after encountering exception
DETAIL: Exception information dump:
Traceback (most recent call last):
  File "/home/wal-e/venv/local/lib/python2.7/site-packages/wal_e/retries.py", line 62, in shim
    return f(*args, **kwargs)
  File "/home/wal-e/venv/local/lib/python2.7/site-packages/wal_e/worker/s3/s3_worker.py", line 82, in fetch_partition
    raise exc
SSLError: The read operation timed out
HINT: A better error message should be written to handle this exception. Please report this output and, if possible, the situation under which it arises.
STRUCTURED: time=2016-01-05T02:40:36.548270-00 pid=32468
The set of problem volumes varies from one backup-fetch invocation to
another, I guess due to transient network issues.
Anyway, WAL-E doesn't retry fetching these volumes. Is there a way to make
it do so? Is there a fix or workaround for this?
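A possible stopgap is to retry the whole backup-fetch command from the outside with exponential backoff. This is only a minimal sketch: `run_with_retries` and its parameters are hypothetical names, not part of WAL-E, and it assumes backup-fetch exits non-zero when a volume fails to download, which I have not verified.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, base_delay=1.0,
                     runner=subprocess.call, sleep=time.sleep):
    """Run cmd until it exits 0 or attempts run out; return the last exit code.

    Hypothetical helper, not a WAL-E API. `runner` and `sleep` are
    injectable so the retry logic can be exercised without a real command.
    """
    code = None
    for attempt in range(attempts):
        code = runner(cmd)
        if code == 0:
            return code
        # Back off 1x, 2x, 4x ... base_delay, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return code


# Example call (the data directory path is a placeholder):
# run_with_retries(["wal-e", "backup-fetch", "/path/to/pgdata", "LATEST"])
```

The obvious drawback is that each retry re-downloads the entire base backup rather than only the volumes that timed out, so this does not replace a per-volume retry inside WAL-E.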
Thanks,
Quinn