Hi,
I'm working on OAK-8105, which updates AzureDataStore to use the new
Azure v12 SDK instead of the deprecated v8 SDK, and I may have run into a
snag where I could use some input from the team.
The main issue: the current cloud data store implementations (Azure and S3)
behave as follows. When a client tries to write a binary that already
exists in blob storage, the binary is not written again; instead, the
last-modified time of the existing binary is updated and a record for the
existing binary is returned as the result. The question is: what would be
the impact if we were unable to update the last-modified time in this
situation?
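
To make the behavior concrete, here is a rough sketch of that write path.
It uses an in-memory map as a stand-in for blob storage; the class and
method names (DedupWriteSketch, Record, write, read) are made up for
illustration and are not the actual Oak or Azure SDK code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for blob storage; not the Oak implementation.
class DedupWriteSketch {

    // Minimal record: the binary's identifier and its last-modified time.
    static final class Record {
        final String id;
        final long lastModified;

        Record(String id, long lastModified) {
            this.id = id;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, byte[]> blobs = new HashMap<>();
    private final Map<String, Long> lastModified = new HashMap<>();

    // Stores the binary only if the id is new; in either case the
    // last-modified time is set to 'now' and a record is returned.
    Record write(String id, byte[] data, long now) {
        if (!blobs.containsKey(id)) {
            blobs.put(id, data.clone());
        }
        lastModified.put(id, now);
        return new Record(id, now);
    }

    // Returns the stored bytes, or null if the id is unknown.
    byte[] read(String id) {
        return blobs.get(id);
    }
}
```

Writing the same id twice leaves the original bytes in place and only moves
the timestamp forward, which is the step that is in question here.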
Background: AzureDataStore currently allows authentication/authorization
to the Azure storage service in two different ways. One is via an access
key, essentially a shared secret created by the storage service. The other
is via a shared access signature, which can be generated via an API call.
Importantly, we never use both in a single instance: we use the access key
if it is provided, and otherwise use the shared access signature.
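
As a minimal sketch of that selection rule: the names below
(AzureAuthChoiceSketch, AuthMode, choose) are hypothetical and not taken
from AzureDataStore, and treating an empty string as "not provided" and
throwing when neither credential is present are assumptions on my part.

```java
// Hypothetical helper; names are illustrative, not from AzureDataStore.
class AzureAuthChoiceSketch {

    enum AuthMode { ACCESS_KEY, SHARED_ACCESS_SIGNATURE }

    // Prefers the access key when one is provided; otherwise falls back
    // to the shared access signature. The two are never combined.
    static AuthMode choose(String accessKey, String sharedAccessSignature) {
        if (accessKey != null && !accessKey.isEmpty()) {
            return AuthMode.ACCESS_KEY;
        }
        if (sharedAccessSignature != null && !sharedAccessSignature.isEmpty()) {
            return AuthMode.SHARED_ACCESS_SIGNATURE;
        }
        // Assumption: having neither credential is a configuration error.
        throw new IllegalArgumentException("no access key or shared access signature provided");
    }
}
```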
Azure's API does not allow modifying the last-modified property of a blob
directly. To get that effect, we have so far issued a service-side blob
copy instruction that copies the blob to itself, which updates the
last-modified value as a side effect.
However, based on my testing with the new Azure SDK, there are certain API
operations that you cannot perform when you authenticate with a shared
access signature. One of them is a service-side blob copy. I am working
with Microsoft directly to try to find a workaround, but if my testing is
correct, we may not be able to update the last-modified value when writing
an already existing binary if a shared access signature is used to
authenticate.
(It is possible this never worked with the old SDK either; I don't think
that particular behavior was ever tested using a shared access signature
before today.)
If we cannot find a workaround, I see the following options:
- Don't update the last-modified value if we authenticate using a shared
access signature. (Or don't worry about it at all if it doesn't actually
matter - but I assume it does matter.)
- Don't allow authentication/authorization with shared access signatures
for AzureDataStore. (This would potentially break existing implementations
that are using this method to authenticate.)
Sorry for the long email, but I thought the full context was necessary.
Open to thoughts on this.
-MR