Hello everyone,
To be sure: I have read the relevant section of the documentation, but I
would like to clarify, concretely and based on people's experience and on
the knowledge of those familiar with S3QL's implementation, what should be
done to ensure the data integrity of an S3QL filesystem on an Amazon S3
backend.
(As a note to the S3QL documentation editor: according to Amazon, the
"Standard Region" now offers immediate consistency ("read after write") as
of June 19, 2015. You may want to update the documentation to reflect this,
as it currently lists the consistency as "Unknown" rather than immediate.
For more, see https://forums.aws.amazon.com/ann.jspa?annID=3112.)
From what I understand, for optimal data preservation and access, I want a
service provider with an immediate consistency window (which Amazon S3 now
provides across all regions) as well as high durability of objects. I
believe, and correct me if I'm wrong, that Amazon S3 satisfies these
requirements quite well, with high durability and immediate
read-after-write consistency. (I have chosen to use the reduced redundancy
option to save money as I also store the data on both internal disks and an
external disk locally, except for older backups, and plan on replicating
the data ALSO to Google Cloud for extra redundancy / insurance.)
The documentation makes clear that, as with any other file system, I should
run S3QL's fsck program, "fsck.s3ql", to verify file integrity and, from
time to time, "s3ql_verify". So, my main questions in this e-mail:
In concrete terms, to best ensure the integrity of my data, how often
should I run fsck.s3ql on each filesystem (I have 3 now), and how often
should I run s3ql_verify? Roughly how long would s3ql_verify take on a
filesystem of about 2 TB? Are objects downloaded, that is, can I expect to
pay for object retrieval on each invocation of s3ql_verify? If an object is
corrupt and s3ql_verify deletes it, and fsck.s3ql then detects the missing
object, do I need a copy of the object on the local filesystem for it to
restore from, or is there some kind of backup of that object somewhere on
the S3QL filesystem?
Please excuse me, as I am not too familiar with filesystems under the hood,
but I'd like to know the best practices for maintaining an S3QL filesystem
for long periods of time, as I am storing quite a lot of our data with it.
Thank you for the help, and thank you Mr. Rath for excellent FREE software!
Warm regards,
Brandon