Tellier Benoit created JAMES-2852:
-------------------------------------

             Summary: Optimizing CassandraBlobStore deleteBucket
                 Key: JAMES-2852
                 URL: https://issues.apache.org/jira/browse/JAMES-2852
             Project: James Server
          Issue Type: Improvement
          Components: Blob, cassandra
            Reporter: Tellier Benoit


Currently, CassandraBlobStore needs to iterate over the stored blobs in order 
to delete a bucket.

These were our design considerations:

We wanted to avoid the "wide row" issue (with many blobs stored in the same 
bucket, the maximum size of a cell would have been exceeded) and to optimize 
data distribution across the cluster. For these reasons, we had to choose a 
primary key with a finer granularity than just the bucket: we chose to rely on 
the bucket and the object identifier. This makes deleting a bucket a slow 
operation, as all blobs not in the default bucket need to be iterated over.
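
To illustrate the trade-off described above, here is a minimal in-memory 
sketch in Java. It is not the CassandraBlobStore code: the class, the BlobKey 
record and the method names are hypothetical, and the distinction between the 
default bucket and other buckets is ignored. It only shows that when rows are 
keyed by (bucket, blob id), nothing lists the blobs of one bucket, so deleting 
a bucket has to visit every row.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class ScanBasedBucketDeletion {
    // Hypothetical model of a composite key made of (bucket, blob id).
    record BlobKey(String bucket, String blobId) {}

    private final Map<BlobKey, byte[]> rows = new HashMap<>();
    private long rowsVisited = 0;

    public void save(String bucket, String blobId, byte[] data) {
        rows.put(new BlobKey(bucket, blobId), data);
    }

    // No per-bucket listing exists, so every stored row is inspected.
    public int deleteBucket(String bucket) {
        int deleted = 0;
        Iterator<BlobKey> it = rows.keySet().iterator();
        while (it.hasNext()) {
            rowsVisited++;
            if (it.next().bucket().equals(bucket)) {
                it.remove();
                deleted++;
            }
        }
        return deleted;
    }

    public long rowsVisited() {
        return rowsVisited;
    }

    public int size() {
        return rows.size();
    }
}
```

In this sketch the cost of deleteBucket grows with the total number of stored 
rows, not with the size of the bucket being deleted.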

The only usage so far is the vault, which currently relies on 13 buckets, hence 
the overhead introduced is reasonable.

However, this cost will increase as we expand our usage of buckets.

Later on, we could introduce a time series to easily retrieve the blobs stored 
in a bucket and avoid iterating over unrelated blobs.
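
One possible shape for such a time series, sketched in memory in Java. This is 
an assumption about how the idea could look, not a design taken from James: 
the class name, the one-day slice size and all method names are hypothetical. 
Blob ids are grouped per bucket and per day, so the blobs of a bucket can be 
listed without touching other buckets. In Cassandra, each (bucket, day) pair 
would presumably be its own partition to keep partitions bounded.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class TimeSeriesBucketIndex {
    // bucket -> (day slice -> blob ids saved during that day); hypothetical layout.
    private final Map<String, TreeMap<Instant, Set<String>>> slices = new HashMap<>();

    // Registers a blob id in the day slice matching its save time.
    public void add(String bucket, String blobId, Instant savedAt) {
        Instant day = savedAt.truncatedTo(ChronoUnit.DAYS);
        slices.computeIfAbsent(bucket, b -> new TreeMap<>())
              .computeIfAbsent(day, d -> new LinkedHashSet<>())
              .add(blobId);
    }

    // Lists the blob ids of one bucket, oldest slice first.
    public List<String> blobIdsOf(String bucket) {
        List<String> ids = new ArrayList<>();
        slices.getOrDefault(bucket, new TreeMap<>()).values().forEach(ids::addAll);
        return ids;
    }

    // Returns the blob ids to delete and drops the index entries of the bucket.
    public List<String> removeBucket(String bucket) {
        List<String> ids = blobIdsOf(bucket);
        slices.remove(bucket);
        return ids;
    }
}
```

With an index like this, deleting a bucket would only read the slices of that 
bucket and then delete the listed blobs by key, at the price of one extra 
write per saved blob.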



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
