[ 
https://issues.apache.org/jira/browse/JAMES-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tellier Benoit closed JAMES-2390.
---------------------------------

> JMAP attachment performance issues
> ----------------------------------
>
>                 Key: JAMES-2390
>                 URL: https://issues.apache.org/jira/browse/JAMES-2390
>             Project: James Server
>          Issue Type: New Feature
>          Components: cassandra, JMAP
>    Affects Versions: master
>            Reporter: Tellier Benoit
>            Assignee: Antoine Duprat
>            Priority: Major
>              Labels: perfomance
>         Attachments: Capture d’écran de 2018-05-06 19-32-31.png, Capture 
> d’écran de 2018-05-06 19-35-02.png
>
>
> Most of the Cassandra failures are related to attachment downloads, and more 
> precisely to attachment rights checking.
> Looking at the attached screenshots:
>  - A lot of warnings are generated by JMAP attachment downloads.
>  - The failure happens when reading metadata, in order to retrieve the list 
> of referencing messages to resolve rights.
>  - Furthermore, the failure is systematic for some attachments.
> I spent a bit of time this weekend analysing these (unexpected!) performance 
> issues. I found two intuitive performance improvements, as well as a more 
> complex one.
>  - 1. Upon checking whether a set of messages is accessible, the containing 
> mailbox rights were checked on a per-message basis. This is sub-optimal as 
> some messages might be in the same mailbox, whose rights will be needlessly 
> checked several times.
> This change fits smoothly into the codebase: the tooling for checking rights 
> once per mailbox is already implemented, just not used in that case.
>  - 2. Paging and asynchronous code don't combine well, as already proven by 
> previous code. The mantra is *join then collect*. If the operations are done 
> in the reverse order and the entries exceed the paging size (~5000), an 
> exception will be thrown by the Cassandra driver.
> This explains the systematic failures for some specific attachments. The 
> fix is trivial, and I added a test demonstrating this.
>  - 3. The given logs suggest that we have high cardinality rows in our 
> database (i.e. an attachment referenced by several messages), as the number 
> of referencing messages exceeds 5000 (enough to trigger the paging issue).
> Such a high cardinality has a massive read cost:
>  - Reading such a row is a complex operation
>  - Caching cannot help as the cache size per primary key is exceeded
>  - Rights would be resolved for each referencing message, generating an 
> expensive read cascade.
> Note that deduplication is done at the Attachment level. By looking at the 
> attachment names (cf. screenshots) we can notice these "high cardinality" 
> attachments look like inlined images in signatures.
> The stance here is that deduplication is not a concern for attachments, but 
> for blobs. We should push this constraint further down the stack. 
> That way, each blob would be deduplicated (storage cost reduction, higher FS 
> cache efficiency, etc.) while avoiding *wide rows*.
> We should ensure each newly generated AttachmentId is unique, then generate 
> BlobId from the blob's content, to avoid wide rows while keeping 
> deduplication in place.
> Note that since this is done only for newly received messages, it can be 
> done transparently, without the need for a migration.
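
To illustrate the proposal above (a unique AttachmentId per attachment, with the BlobId derived from the content), here is a minimal sketch. It is not the James implementation: the class name, the in-memory maps standing in for the Cassandra tables, and the choice of SHA-256 and random UUIDs are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: names and storage are illustrative, not James APIs.
class AttachmentIdSketch {
    // In-memory stand-ins for the blob table and the attachment table.
    static final Map<String, byte[]> blobsById = new HashMap<>();
    static final Map<String, String> blobIdByAttachmentId = new HashMap<>();

    // Content-derived identifier: identical bytes always yield the same blob ID.
    // SHA-256 is an assumption; the issue does not name a hash function.
    static String blobIdFor(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stores the content once per distinct blob, but returns a fresh
    // attachment ID on every call, so no attachment row grows wide.
    static String storeAttachment(byte[] content) {
        String blobId = blobIdFor(content);
        blobsById.putIfAbsent(blobId, content);
        String attachmentId = UUID.randomUUID().toString();
        blobIdByAttachmentId.put(attachmentId, blobId);
        return attachmentId;
    }

    public static void main(String[] args) {
        byte[] signatureImage = "same inlined image".getBytes(StandardCharsets.UTF_8);
        String first = storeAttachment(signatureImage);
        String second = storeAttachment(signatureImage);
        System.out.println("distinct attachment IDs: " + !first.equals(second));
        System.out.println("stored blobs: " + blobsById.size());
    }
}
```

With this shape, each attachment is referenced by exactly one message, while the same signature image received many times still occupies a single blob.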

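The first improvement in the description (checking rights once per mailbox rather than once per message) could look roughly like the sketch below. The class, the method signature and the use of plain strings as identifiers are hypothetical; James already has its own tooling for this, which this sketch does not reproduce.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: identifiers are plain strings, not James types.
class MailboxRightsSketch {
    // mailboxByMessage maps each message ID to its containing mailbox ID.
    // canReadMailbox stands in for the (expensive) rights lookup.
    static Map<String, Boolean> accessibleMessages(Map<String, String> mailboxByMessage,
                                                   Predicate<String> canReadMailbox) {
        // Remember the answer for each mailbox so it is resolved only once.
        Map<String, Boolean> readableByMailbox = new HashMap<>();
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : mailboxByMessage.entrySet()) {
            boolean readable = readableByMailbox.computeIfAbsent(entry.getValue(), canReadMailbox::test);
            result.put(entry.getKey(), readable);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> mailboxByMessage = new LinkedHashMap<>();
        mailboxByMessage.put("message-1", "inbox");
        mailboxByMessage.put("message-2", "inbox");
        mailboxByMessage.put("message-3", "shared");
        System.out.println(accessibleMessages(mailboxByMessage, mailboxId -> mailboxId.equals("inbox")));
    }
}
```

The number of rights lookups is then bounded by the number of distinct mailboxes rather than by the number of referencing messages.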


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
