Tellier Benoit created JAMES-2390:
-------------------------------------

             Summary: JMAP attachment performance issues
                 Key: JAMES-2390
                 URL: https://issues.apache.org/jira/browse/JAMES-2390
             Project: James Server
          Issue Type: New Feature
          Components: cassandra, JMAP
    Affects Versions: master
            Reporter: Tellier Benoit
            Assignee: Antoine Duprat


Most of the Cassandra failures are related to attachment downloads, and more 
precisely to attachment right checking.

Having a look at attached screenshots:
 - We can notice a lot of warnings are generated by JMAP attachment downloads.
 - That failure happens when reading meta-data, in order to retrieve the list 
of referencing messages to resolve rights.
 - Furthermore, we can notice failure is systematic for some attachments.

I spent a bit of time this weekend analysing these (unexpected!) performance 
issues. I found two intuitive performance improvements, as well as one more 
complex one.

 - 1. When checking whether a set of messages is accessible, the rights of the 
containing mailboxes were checked on a per-message basis. This is sub-optimal, 
as several messages may belong to the same mailbox, whose rights are then 
needlessly checked several times.

This change fits smoothly into the codebase: the tooling for checking rights 
once per mailbox is already implemented, it was just not used in this case.
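A minimal JDK-only sketch of the idea, using illustrative names rather than the 
actual James API:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Sketch of improvement 1: resolve mailbox rights once per distinct mailbox,
// then reuse the verdict for every message in that mailbox. All names here
// are illustrative, not the actual James classes.
public class RightsCheckSketch {

    record Message(String mailboxId, String messageId) {}

    static Set<String> accessibleMessages(List<Message> messages,
                                          Predicate<String> canRead) {
        // One rights lookup per distinct mailbox, not per message.
        Map<String, Boolean> rightsByMailbox = messages.stream()
                .map(Message::mailboxId)
                .distinct()
                .collect(Collectors.toMap(id -> id, canRead::test));
        // Reuse the cached verdict for each message.
        return messages.stream()
                .filter(m -> rightsByMailbox.get(m.mailboxId()))
                .map(Message::messageId)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Message> msgs = List.of(
                new Message("inbox", "m1"),
                new Message("inbox", "m2"),
                new Message("spam", "m3"));
        // Rights are resolved twice (two mailboxes), not three times.
        System.out.println(accessibleMessages(msgs, "inbox"::equals));
    }
}
```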

 - 2. Paging and asynchronous code don't combine well, as previous code has 
already proven. The mantra is *join then collect*. If the operation is done in 
reverse order and the entries exceed the paging size (~5000), an exception will 
be thrown by the Cassandra driver.

This explains the systematic failures for some specific attachments... The fix 
is trivial, and I added a test demonstrating it.
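A small JDK-only sketch of the *join then collect* pattern, with plain 
CompletableFutures standing in for the Cassandra driver's async API (names are 
illustrative):

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Sketch of improvement 2: join all pending futures first, then collect the
// results on the caller's thread. Collecting inside the async pipeline risks
// blocking a driver thread on a background page fetch once the result set
// exceeds the page size. CompletableFutures stand in for the driver's API.
public class JoinThenCollect {

    static List<Integer> joinThenCollect(List<CompletableFuture<Integer>> futures) {
        // Join first: wait for every asynchronous result...
        CompletableFuture.allOf(futures.toArray(CompletableFuture[]::new)).join();
        // ...then collect: every join() below now returns immediately,
        // so no collection step can stall waiting on a pending page.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<CompletableFuture<Integer>> futures = List.of(
                CompletableFuture.supplyAsync(() -> 1),
                CompletableFuture.supplyAsync(() -> 2),
                CompletableFuture.supplyAsync(() -> 3));
        System.out.println(joinThenCollect(futures)); // [1, 2, 3]
    }
}
```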

 - 3. The given logs suggest that we have high cardinality rows in our database 
(i.e. an attachment referenced by several messages), as the number of 
referencing messages exceeds 5000, which is enough to trigger the paging issue.

Such a high cardinality has a massive read cost:
 - Reading such a row is a complex operation
 - Caching can not help, as the cache size per primary key is exceeded
 - Rights would be resolved for each referencing message, generating an 
expensive read cascade.

Note that deduplication is done at the Attachment level. By looking at the 
attachment names (cf. screenshots) we can notice these "high cardinality" 
attachments look like inlined images in signatures...

The stance here is that deduplication is not a concern for attachments, but 
for blobs. We should push this constraint lower in the stack. That way, each 
blob would still be deduplicated (storage cost reduction, higher FS cache 
efficiency, etc.) while avoiding *wide rows*.

We should ensure each newly generated AttachmentId is unique, then generate 
BlobId from the blob's content, to avoid wide rows while keeping deduplication 
in place.
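A hedged sketch of that split, assuming a random AttachmentId and a 
content-addressed BlobId (hypothetical names; the real James AttachmentId and 
BlobId types differ):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.UUID;

// Sketch of improvement 3: a unique AttachmentId per upload (so no attachment
// row is ever referenced by thousands of messages), plus a content-addressed
// BlobId (so identical bytes are still stored once). Illustrative names only.
public class IdSketch {

    static String newAttachmentId() {
        // Random, hence unique per stored attachment: no wide rows.
        return UUID.randomUUID().toString();
    }

    static String blobId(byte[] content) {
        // Same bytes => same BlobId => deduplication at the blob level.
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-256 is always available", e);
        }
    }

    public static void main(String[] args) {
        byte[] image = "inlined signature image".getBytes(StandardCharsets.UTF_8);
        System.out.println(newAttachmentId().equals(newAttachmentId())); // false
        System.out.println(blobId(image).equals(blobId(image)));         // true
    }
}
```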

Note that since this applies only to newly received messages, it can be done 
transparently, without the need for a migration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
