[ 
https://issues.apache.org/jira/browse/JAMES-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoit Tellier updated JAMES-3570:
----------------------------------
    Description: 
h3. What 

At Linagora, we deploy our James offer at OVH, which provides a Swift backed S3 
API.

Running the deduplicating blob store on such a setup, we encountered the 
following errors:
 - 400 errors with no message body upon PUT
 - NoSuchKeyException being thrown by the S3 driver

We noticed these exceptions are occasionally thrown during post processing 
(event bus), for emails with several recipients. Post processing eventually 
succeeds.

In addition to unpleasant errors in the logs, we wonder whether two concurrent 
APPENDs could lead to data corruption.

h3. Why

Swift does not implement strong read/write isolation. We ran the James 
BlobStore test suite against the OVH S3 APIs and encountered the same errors.

We thus attribute these errors to a lack of isolation.

h3. How to fix

Our strategy so far is to implement 
https://github.com/apache/james-project/blob/master/src/adr/0015-objectstorage-blobid-list.md,
 but as part of a Linagora product built on top of James (as this ADR was 
controversial and rejected by part of the community).

The idea behind this fix is to leverage Cassandra to avoid, most of the time, 
storing a blob twice.
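The idea can be sketched as follows. This is a minimal, hypothetical illustration: a plain set and dict stand in for the Cassandra blobId list and the S3 bucket, and all names are illustrative rather than the actual James BlobStore API.

```python
# Hypothetical sketch of the ADR 0015 blobId-list idea: consult a fast,
# strongly-consistent store (here a set standing in for a Cassandra table)
# before uploading, so an already-known blob is not re-PUT to S3.
import hashlib

class DeduplicatingBlobStore:
    def __init__(self):
        self.known_blob_ids = set()   # stands in for the Cassandra blobId list
        self.object_store = {}        # stands in for the S3 bucket
        self.uploads = 0              # number of actual PUTs issued

    def save(self, data: bytes) -> str:
        blob_id = hashlib.sha256(data).hexdigest()
        if blob_id not in self.known_blob_ids:
            # Only reach the object store when the blobId is unseen.
            self.object_store[blob_id] = data
            self.uploads += 1
            self.known_blob_ids.add(blob_id)
        return blob_id

store = DeduplicatingBlobStore()
a = store.save(b"same payload")
b = store.save(b"same payload")  # second save skips the S3 PUT entirely
```

Because the second save never touches S3, the weakly-isolated concurrent PUTs that trigger the errors above happen far less often.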

We are not against contributing this mitigation to Apache, but we wish to 
fix it on our side in a timely fashion first. If you would like this to be 
shared, please show interest.

Amongst the alternatives we can consider:

 - To run on an alternative object storage (for instance Zenko CloudServer, 
used by our test suite, or AWS S3). 

 - To check the object store itself before APPENDs. This does not prevent data 
races (the blob might be inserted by another thread between the check and the 
append). Previous experiments have also shown it carries a high performance 
cost. See https://github.com/linagora/james-project/pull/2011

 - To disable deduplication. This makes it possible to effectively delete data 
(not yet the case with deduplication) but results in data duplication and 
thus increased storage costs. It can already be done today via a configuration 
option.
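Regarding the check-before-append alternative, the race it cannot close can be shown deterministically. This hypothetical sketch forces the problematic interleaving explicitly (no real threads or S3 calls; `exists`/`put` are illustrative stand-ins):

```python
# Why checking the store before an APPEND does not prevent the race:
# both writers can pass the existence check before either PUT lands,
# so both issue the PUT anyway. The interleaving is forced explicitly.
object_store = {}
puts = []  # record every PUT issued (each one is an extra round-trip)

def exists(blob_id: str) -> bool:
    return blob_id in object_store

def put(blob_id: str, data: bytes) -> None:
    puts.append(blob_id)
    object_store[blob_id] = data

blob_id = "sha256-of-message"

# Writer A and writer B both check first ...
a_sees_missing = not exists(blob_id)
b_sees_missing = not exists(blob_id)
# ... and since neither PUT has landed yet, both proceed to upload.
if a_sees_missing:
    put(blob_id, b"message body")
if b_sees_missing:
    put(blob_id, b"message body")
```

Both PUTs go through, so the check only adds a round-trip without removing the concurrent write, which matches the performance cost observed in the linked experiment.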


h3. Interaction with other features

An effort to implement deletion of deduplicated content is currently on pause. 
BlobIds will ship a notion of generation: we never delete blobs of the current 
generation (nor of a buffer generation, to guard against clock skew), so we 
never delete blobs that we are appending. By design, de-duplicated content 
removal will therefore not interfere with this blobId list mechanism.
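The generation rule can be sketched as follows. This is a hypothetical illustration: the one-month generation span, the one-generation buffer, and all names are assumptions for the example, not James's actual values.

```python
# Hypothetical sketch of generation-aware deletion: a blob's generation is
# derived from its creation time, and only blobs strictly older than the
# current generation minus a buffer generation are eligible for deletion.
# Blobs being appended right now therefore can never be deleted.
GENERATION_SECONDS = 30 * 24 * 3600  # assumed generation span (one month)
BUFFER_GENERATIONS = 1               # guard against clock skew

def generation_of(epoch_seconds: int) -> int:
    return epoch_seconds // GENERATION_SECONDS

def deletable(blob_epoch_seconds: int, now_epoch_seconds: int) -> bool:
    return generation_of(blob_epoch_seconds) < (
        generation_of(now_epoch_seconds) - BUFFER_GENERATIONS
    )

now = 100 * GENERATION_SECONDS
# A blob written in the current generation is never deletable:
current = deletable(now, now)
# Nor is one from the buffer generation just before it:
buffered = deletable(now - GENERATION_SECONDS, now)
# A blob several generations old is deletable:
old = deletable(now - 3 * GENERATION_SECONDS, now)
```

The buffer generation is what absorbs clock skew: even if one node's clock is slightly behind, the blobs it is still appending fall at worst into the previous generation, which remains protected.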

This blobId list mechanism is scoped to the default bucket so as not to 
interfere with the DeletedMessageVault.


> Errors running deduplicated BlobStore on top of OVH (Swift) S3 APIs
> -------------------------------------------------------------------
>
>                 Key: JAMES-3570
>                 URL: https://issues.apache.org/jira/browse/JAMES-3570
>             Project: James Server
>          Issue Type: Bug
>          Components: Blob
>    Affects Versions: 3.6.0
>            Reporter: Benoit Tellier
>            Priority: Major
>              Labels: bug
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
