Hello there,

As I stated in https://issues.apache.org/jira/browse/JAMES-3591:

Cassandra is not designed for storing large binaries, and it delivers
sub-optimal performance compared to object storage alternatives (like
S3, MinIO or Apache Ozone).

We need to ensure users are fully aware of the consequences when
choosing this option.

Thus we should add warnings in:

 - The code, via Javadoc
 - The documentation website
 - The Docker Hub README
 - A log message upon startup
 - The sample configuration file


I discussed this topic with Nate McCall (Apache Cassandra PMC):

```gitter
Hi folks - would really like to talk to anyone that worked on the
Cassandra Blob Store implementation about potentially pulling this out
for general use. Please ping on zzn...@apache.org or zznate on asf's slack.
```

We then continued by email:

```private email
Hello Nate,

Thank you very much for raising this topic.

I have been seriously concerned about the performance and storage costs
of the Cassandra blob store for quite some time already.

The Apache James PMC had been reluctant to remove it as we were worried
about bringing additional runtime dependencies to the project (meaning
forcing users to rely on an object store like Ozone or MinIO).

I personally encourage any move on this topic to deprecate it or provide
extensive warnings regarding its use, and I am very curious to know what
you have to say about it.

Best regards,
```

Nate answered:

```private email
Hi Benoit,
Thanks for the response. At a high level, I completely agree with you -
a database of any sort is not the right place for binary content. That
said, I regularly see cases where folks are in a situation like "this is
what we have provisioned and accounted for, let's just use it."

As it stands, this is one of the better binary storage approaches which
I have seen implemented. A checksumming, reactive API with a
configurable chunk size solves a lot of problems for people.

At the end of the day though, I do very much agree that the right answer
is to use a distributed filesystem of some sort (Ozone and MinIO would
definitely be better), and folks should be warned about the substantial
storage and performance overhead of doing it in C*. But this approach at
least will "suck less" than many others I have seen using C* similarly.

Thanks again for the response, and nice to meet you either way.

Cheers,
-Nate
```

I opened a PR to enact it:
https://github.com/apache/james-project/pull/450

Cheers,

Benoit

