Tellier Benoit created JAMES-2852:
-------------------------------------
Summary: Optimizing CassandraBlobStore deleteBucket
Key: JAMES-2852
URL: https://issues.apache.org/jira/browse/JAMES-2852
Project: James Server
Issue Type: Improvement
Components: Blob, cassandra
Reporter: Tellier Benoit
Currently, CassandraBlobStore needs to iterate over all the blobs of a bucket
in order to delete that bucket.
These were our design considerations:
We avoided the "wide row" issue (with many blobs stored in the same bucket, the
maximum size of a cell would have been exceeded) and optimized data
distribution across the cluster. For these reasons, we had to choose a primary
key with a finer granularity than just the bucket: we chose to rely on the
bucket and the object identifier. This makes deleting a bucket slow, as all
blobs not in the default bucket need to be iterated over.
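
To make the cost concrete, here is a minimal in-memory sketch in Java. It is
not the actual CassandraBlobStore code: the class ScanningBlobStore, its Key
type and the visited counter are hypothetical names used for illustration, and
a map keyed on (bucket, blobId) stands in for the Cassandra table. Because the
bucket alone does not identify a partition, deleting one bucket has to look at
every stored key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory model, not the James implementation.
final class ScanningBlobStore {
    // Composite key: one entry per (bucket, blobId) pair.
    static final class Key {
        final String bucket;
        final String blobId;

        Key(String bucket, String blobId) {
            this.bucket = bucket;
            this.blobId = blobId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return bucket.equals(other.bucket) && blobId.equals(other.blobId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(bucket, blobId);
        }
    }

    private final Map<Key, byte[]> entries = new LinkedHashMap<>();
    private long visited = 0;

    void save(String bucket, String blobId, byte[] data) {
        entries.put(new Key(bucket, blobId), data);
    }

    // Scans every key, including those of unrelated buckets.
    int deleteBucket(String bucket) {
        List<Key> toDelete = new ArrayList<>();
        for (Key key : entries.keySet()) {
            visited++;
            if (key.bucket.equals(bucket)) {
                toDelete.add(key);
            }
        }
        for (Key key : toDelete) {
            entries.remove(key);
        }
        return toDelete.size();
    }

    long visited() {
        return visited;
    }

    int size() {
        return entries.size();
    }
}
```

In this model the number of keys visited grows with the total number of stored
blobs, not with the size of the bucket being deleted.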
The only usage so far is the vault, which currently relies on 13 buckets, hence
the overhead introduced is reasonable.
However, this cost will increase as we expand our usage of buckets.
Later on, we could introduce a time series to easily retrieve the blobs stored
in a bucket and avoid iterating over unrelated blobs.
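
One possible shape for such a time series, sketched in memory in Java. This is
an assumption about how the idea could look, not a design from this ticket:
the class TimeSeriesIndexedBlobStore, the one-day slice size and the
"bucket/blobId" key format are all hypothetical. Each save also records the
blob id under (bucket, time slice), so deleting a bucket only reads the slices
of that bucket.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model, not the James implementation.
final class TimeSeriesIndexedBlobStore {
    // Assumed slice size: one day per index slice.
    private static final long SLICE_SECONDS = Duration.ofDays(1).getSeconds();

    // Blob data, keyed by "bucket/blobId" (assumes bucket names without '/').
    private final Map<String, byte[]> blobs = new HashMap<>();
    // Index: bucket -> time slice -> blob ids saved during that slice.
    private final Map<String, TreeMap<Long, Set<String>>> index = new HashMap<>();
    private long visited = 0;

    void save(String bucket, String blobId, byte[] data, Instant now) {
        blobs.put(key(bucket, blobId), data);
        long slice = Math.floorDiv(now.getEpochSecond(), SLICE_SECONDS);
        index.computeIfAbsent(bucket, b -> new TreeMap<>())
            .computeIfAbsent(slice, s -> new LinkedHashSet<>())
            .add(blobId);
    }

    // Reads only the index slices of the given bucket.
    int deleteBucket(String bucket) {
        TreeMap<Long, Set<String>> slices = index.remove(bucket);
        if (slices == null) {
            return 0;
        }
        int deleted = 0;
        for (Set<String> blobIds : slices.values()) {
            for (String blobId : blobIds) {
                visited++;
                if (blobs.remove(key(bucket, blobId)) != null) {
                    deleted++;
                }
            }
        }
        return deleted;
    }

    long visited() {
        return visited;
    }

    int size() {
        return blobs.size();
    }

    private static String key(String bucket, String blobId) {
        return bucket + "/" + blobId;
    }
}
```

The nested map hides one question a real Cassandra table would have to answer:
how to enumerate the time slices that exist for a bucket. The trade-off is an
extra index write on every save in exchange for a delete that no longer touches
other buckets.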