BUILD FAILURE: Jackrabbit Oak - Build # 2417 - Failure

2019-10-02 Thread Apache Jenkins Server
The Apache Jenkins build system has built Jackrabbit Oak (build #2417)

Status: Failure

Check console output at https://builds.apache.org/job/Jackrabbit%20Oak/2417/ to 
view the results.

Changes:
No changes
 

Test results:
No tests ran.


Re: [DISCUSS] Impact of not updating last-modified time when writing a blob that already exists?

2019-10-02 Thread Marcel Reutegger
Hi,

On 02.10.19, 09:22, "Matt Ryan" wrote:
> The question is:  What would be the impact if we were unable to
> update the last-modified time in this situation?

I think the update is important for the datastore garbage collection
to work correctly. There was an issue with the MongoBlobStore
implementation a while ago that had a very similar problem:
https://issues.apache.org/jira/browse/OAK-7389

According to comments in the issue, the datastore garbage collector
may remove such blobs.
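
To illustrate why a stale last-modified time can matter, here is a
simplified sketch in Java. It assumes the sweep phase deletes blobs that
are both unreferenced and older than a cutoff time. The names
(SweepSketch, sweepCandidates) are hypothetical and are not Oak's actual
collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Oak's garbage collector.
final class SweepSketch {

    /**
     * Returns the ids of blobs a sweep would delete, ASSUMING the rule
     * "unreferenced and last modified before the cutoff".
     */
    static List<String> sweepCandidates(Map<String, Long> lastModifiedById,
                                        Set<String> referencedIds,
                                        long cutoffMillis) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastModifiedById.entrySet()) {
            boolean unreferenced = !referencedIds.contains(entry.getKey());
            boolean oldEnough = entry.getValue() < cutoffMillis;
            if (unreferenced && oldEnough) {
                candidates.add(entry.getKey());
            }
        }
        return candidates;
    }
}
```

Under that assumption, a blob that was just re-added but not yet seen as
referenced survives only if its last-modified time was moved past the
cutoff.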

Regards
 Marcel



[DISCUSS] Impact of not updating last-modified time when writing a blob that already exists?

2019-10-02 Thread Matt Ryan
Hi,

I'm working on OAK-8105, which updates AzureDataStore to use the new
Azure v12 SDK instead of the deprecated v8 SDK, and I may have run into a
snag where I could use some input from the team.

The main issue:  Current cloud data store implementations (Azure and S3)
have the following behavior:  When a client tries to write a binary that
already exists in blob storage, instead of writing the binary, the existing
binary has the last-modified time updated and a record for the existing
binary is returned as the result.  The question is:  What would be the
impact if we were unable to update the last-modified time in this situation?
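
The behavior described above can be sketched with an in-memory stand-in
for blob storage. This is only an illustration: SketchBlobStore and its
methods are hypothetical names, not the AzureDataStore or S3DataStore
code, and the caller passes the clock value explicitly to keep the
example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a cloud blob container.
final class SketchBlobStore {
    private final Map<String, byte[]> content = new HashMap<>();
    private final Map<String, Long> lastModified = new HashMap<>();

    /**
     * Stores the binary if it is new. If a blob with this id already
     * exists, the bytes are NOT written again; only the last-modified
     * time is refreshed. Returns true when new bytes were stored.
     */
    boolean addRecord(String blobId, byte[] data, long nowMillis) {
        if (content.containsKey(blobId)) {
            lastModified.put(blobId, nowMillis); // "touch" the existing blob
            return false;
        }
        content.put(blobId, data.clone());
        lastModified.put(blobId, nowMillis);
        return true;
    }

    long getLastModified(String blobId) {
        Long timestamp = lastModified.get(blobId);
        if (timestamp == null) {
            throw new IllegalArgumentException("unknown blob: " + blobId);
        }
        return timestamp;
    }
}
```

The question in this thread is what happens if the "touch" branch cannot
be carried out.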

Background:  AzureDataStore currently allows authentication/authorization
to the Azure storage service in two different ways.  One is via an access
key - essentially a shared secret created by the storage service.  The
other is via a shared access signature, which can be generated via an API
call.  Importantly, we never use both in a single instance - we use the
access key if it is provided, and otherwise use the shared access signature.
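
As a rough sketch of that selection rule (access key first, shared access
signature as the fallback): the property names "sketch.accessKey" and
"sketch.sharedAccessSignature" and the class below are made up for this
example and are not the actual AzureDataStore configuration keys.

```java
import java.util.Properties;

// Hypothetical sketch of the "access key wins, else SAS" rule.
final class AuthSelectionSketch {

    enum AuthMode { ACCESS_KEY, SHARED_ACCESS_SIGNATURE }

    static AuthMode select(Properties config) {
        // Property names are illustrative assumptions only.
        String accessKey = config.getProperty("sketch.accessKey");
        if (accessKey != null && !accessKey.isEmpty()) {
            return AuthMode.ACCESS_KEY;
        }
        String sas = config.getProperty("sketch.sharedAccessSignature");
        if (sas != null && !sas.isEmpty()) {
            return AuthMode.SHARED_ACCESS_SIGNATURE;
        }
        throw new IllegalArgumentException("no credentials configured");
    }
}
```

With both values configured, the access key is chosen and the shared
access signature is ignored.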

Azure's API does not allow modifying the last-modified property of a blob
directly.  Until now we have worked around this by issuing a service-side
blob copy instruction that copies the blob to itself, which has the effect
of updating the last-modified value.

However, based on my testing, with the new Azure SDK there are certain API
operations that you cannot perform when you authenticate with a shared
access signature.  One of these is a service-side blob copy.  I am working
with Microsoft directly to try to find a workaround, but if my testing is
correct we may not be able to update the last-modified value when writing
an already existing binary if a shared access signature is used to
authenticate.

(It is possible this never worked with the old SDK either; I don't think
that particular behavior was ever tested using a shared access signature
before today.)


If we cannot find a workaround I see the following options:
- Don't update the last-modified value if we authenticate using a shared
access signature.  (Or don't worry about it at all if it doesn't actually
matter - but I assume it does matter.)
- Don't allow authentication/authorization with shared access signatures
for AzureDataStore.  (This would potentially break existing implementations
that are using this method to authenticate.)


Sorry for the long email, but I thought the full context was necessary.
Open to thoughts on this.


-MR