Hi Peter,

Thanks for following up. Looking at
https://docs.aws.amazon.com/AmazonS3/latest/API/API_Object.html, it
looks like there is a bug in S3QL: The S3 backend expects the ETag to
match the MD5 of the content.

This hasn't been a problem so far because when S3QL itself uploads the
objects, this is the case. But when you're modifying objects with an
external tool, this assumption no longer holds.
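
To illustrate the mismatch, here is a minimal Python sketch of how the two
kinds of ETag come about. It assumes the external tool wrote the object via
a multipart upload, for which S3 derives the ETag from the MD5 digests of
the individual parts and appends the number of parts. The function names and
the part_size parameter are made up for this example and are not S3QL code.

```python
import hashlib


def plain_etag(data: bytes) -> str:
    """ETag of a simple (non-multipart) upload: hex MD5 of the content."""
    return hashlib.md5(data).hexdigest()


def multipart_etag(data: bytes, part_size: int) -> str:
    """ETag of a multipart upload, assuming fixed-size parts (hypothetical helper).

    MD5 over the concatenated binary MD5 digests of all parts,
    followed by a dash and the number of parts.
    """
    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    if not parts:
        parts = [b""]  # treat an empty object as a single empty part
    digests = b"".join(hashlib.md5(part).digest() for part in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"
```

Even an upload consisting of a single part gets an ETag ending in "-1" that
differs from the plain MD5 of the same bytes, which matches the pattern in
the fsck output quoted below.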

I'm not sure how best to fix it. One way would be to simply skip the
ETag check. As long as encryption is being used, the encryption layer
will still detect any corruption. For unencrypted buckets, however,
corruption could then go undetected.
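
A middle ground might be to compare against the ETag only when it has the
shape of a plain MD5, and to treat everything else as unverifiable. The
following is a rough sketch of that idea, not the actual S3QL backend code;
verify_etag and its three-way return value are hypothetical.

```python
import hashlib
import re
from typing import Optional

# A plain MD5 ETag is exactly 32 lowercase hex digits (no "-N" suffix).
_PLAIN_MD5 = re.compile(r"[0-9a-f]{32}")


def verify_etag(etag: str, body: bytes) -> Optional[bool]:
    """Compare an ETag with the MD5 of the body (hypothetical helper).

    Returns True on a match, False on a mismatch, and None if the ETag
    does not look like a plain MD5 and therefore cannot be checked.
    """
    etag = etag.strip('"').lower()  # ETag headers are usually quoted
    if not _PLAIN_MD5.fullmatch(etag):
        return None
    return etag == hashlib.md5(body).hexdigest()
```

This is only a heuristic: the S3 documentation linked above lists further
cases where the ETag is not an MD5 digest, and the shape of the ETag alone
may not reveal all of them.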

The above page talks about the "algorithm that was used to create a
checksum of an object", which seems to be what we want. However, there
is no mention of an actual checksum other than the ETag (which seemingly
cannot be validated by the client). Does anyone know if Amazon provides
other checksums that could be used (e.g. Content-MD5)?

Best,
-Nikolaus



On Jun 14 2023, Peter Marshall <[email protected]> wrote:
> I've experimented, and it's not due to encryption, so I presume it is related
> to multipart during the AWS backup or restore process.
>
> e.g. original bucket, with mountable s3ql:
>
> object ncs3ql_data_1 has ETag 6af237cd6aa167ec276eb58f9f9a52c6
>
> Same object on restored bucket: ETag dc3d145a13fa955d024aaa4165826530-1
>
> If I download the file from the restored bucket, and run md5 on it, I get
> back the original ETag, so it is the same data.
>
> If I try fsck on the restored bucket, it's v not happy:
>
> WARNING: MD5 mismatch for s3ql_passphrase: b428f7203f2bfd8b547be1ade86a74a3-1 vs 0c8d069f75210a79332014f7cb38454a
> WARNING: MD5 mismatch for s3ql_passphrase: b428f7203f2bfd8b547be1ade86a74a3-1 vs 0c8d069f75210a79332014f7cb38454a
> WARNING: MD5 mismatch for s3ql_passphrase: b428f7203f2bfd8b547be1ade86a74a3-1 vs 0c8d069f75210a79332014f7cb38454a
> Encountered BadDigestError (BadDigest: ETag header does not agree with calculated MD5), retrying Backend.perform_read (attempt 3)...
> WARNING: MD5 mismatch for s3ql_passphrase: b428f7203f2bfd8b547be1ade86a74a3-1 vs 0c8d069f75210a79332014f7cb38454a
> Encountered BadDigestError (BadDigest: ETag header does not agree with calculated MD5), retrying Backend.perform_read (attempt 4)...
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "s3ql" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/s3ql/00dfa783-12e1-b8e9-85ca-d7809b44e029%40goteck.co.uk.

