[
https://issues.apache.org/jira/browse/JAMES-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benoit Tellier updated JAMES-3576:
----------------------------------
Attachment: (was: after_gatling.png)
> Further denormalize Message entity?
> -----------------------------------
>
> Key: JAMES-3576
> URL: https://issues.apache.org/jira/browse/JAMES-3576
> Project: James Server
> Issue Type: Improvement
> Components: IMAPServer, JMAP
> Affects Versions: 3.6.0
> Reporter: Benoit Tellier
> Assignee: Antoine Duprat
> Priority: Major
> Labels: perf
> Fix For: 3.7.0
>
> Attachments: before_gatling.png, imap-reorg.png, jmap-reorg.png
>
>
> h3. The facts
> Here is our message structure:
> {code:java}
> cqlsh:apache_james> DESCRIBE TABLE imapuidtable ;
> CREATE TABLE apache_james.imapuidtable (
> messageid timeuuid,
> mailboxid timeuuid,
> uid bigint,
> flaganswered boolean,
> flagdeleted boolean,
> flagdraft boolean,
> flagflagged boolean,
> flagrecent boolean,
> flagseen boolean,
> flaguser boolean,
> modseq bigint,
> userflags set<text>,
> PRIMARY KEY (messageid, mailboxid, uid)
> ) WITH comment = 'Holds mailbox and flags for each message, lookup by message
> ID';
> cqlsh:apache_james> DESCRIBE TABLE messageidtable ;
> CREATE TABLE apache_james.messageidtable (
> mailboxid timeuuid,
> uid bigint,
> flaganswered boolean,
> flagdeleted boolean,
> flagdraft boolean,
> flagflagged boolean,
> flagrecent boolean,
> flagseen boolean,
> flaguser boolean,
> messageid timeuuid,
> modseq bigint,
> userflags set<text>,
> PRIMARY KEY (mailboxid, uid)
> ) WITH comment = 'Holds mailbox and flags for each message, lookup by mailbox
> ID + UID';
> cqlsh:apache_james> DESCRIBE TABLE messagev3 ;
> CREATE TABLE apache_james.messagev3 (
> messageid timeuuid PRIMARY KEY,
> bodycontent text,
> bodyoctets bigint,
> bodystartoctet int,
> attachments list<frozen<attachments>>,
> // and also message properties
> ) WITH comment = 'Holds message metadata, independently of any mailboxes.
> Content of messages is stored in `blobs` and `blobparts` tables. Optimizes
> property storage compared to V2.';
> {code}
> A very common pattern is to access message headers.
> - imap-reorg.png (attached) shows me opening my IMAP mailbox after a long
> weekend. We can see that my MUA lists the headers of the 108 messages
> received in that time span. In order to retrieve the storage information,
> the messagev3 table needs to be accessed for each message, generating a huge
> count of PRIMARY KEY reads that are not strictly necessary. Reading messageV3
> takes second place in query time.
> - Similar things happen on top of JMAP. jmap-reorg.png shows 2 webmail
> email list loads. Same thing: for each message entry, we need to query
> messagev3 to retrieve the storage information and be able to retrieve the
> headers. Here messagev3 reads take first place, ahead of the message metadata
> reads and the header reads.
> h3. The bit of Cassandra philosophy we might have missed...
> https://www.datastax.com/blog/basic-rules-cassandra-data-modeling
> {code:java}
> # Non-Goals
> ## Minimize the Number of Writes
> Writes in Cassandra aren't free, but they're awfully cheap. Cassandra is
> optimized for high write throughput, and almost all writes are equally
> efficient [1].
> ## Minimize Data Duplication
> Denormalization and duplication of data is a fact of life with Cassandra.
> Don't be afraid of it. [...] In order to get the most efficient reads, you
> often need to duplicate data.
> # Basic goals
> [...]
> ## Rule 2: Minimize the Number of Partitions Read
> [...] Furthermore, even on a single node, it's more expensive to read from
> multiple partitions than from a single one due to the way rows are stored.
> {code}
> https://thelastpickle.com/blog/2017/03/16/compaction-nuance.html
> {code:java}
> An incorrect data model can turn a single query into hundreds of queries,
> resulting in increased latency, decreased throughput, and missed SLAs.
> {code}
> (This one is from an article about compaction, but my feeling is that it is
> very relevant to the situation I describe, so I could not refrain from
> quoting it...)
> h3. The new data-model
> I propose to do the following:
> {code:java}
> cqlsh:apache_james> ALTER TABLE messageIdTable ADD internalDate timestamp ;
> cqlsh:apache_james> ALTER TABLE messageIdTable ADD bodyStartOctet bigint ;
> cqlsh:apache_james> ALTER TABLE messageIdTable ADD fullContentOctets bigint ;
> cqlsh:apache_james> ALTER TABLE messageIdTable ADD headerContent text ;
> cqlsh:apache_james> ALTER TABLE imapUidTable ADD internalDate timestamp ;
> cqlsh:apache_james> ALTER TABLE imapUidTable ADD bodyStartOctet bigint ;
> cqlsh:apache_james> ALTER TABLE imapUidTable ADD fullContentOctets bigint ;
> cqlsh:apache_james> ALTER TABLE imapUidTable ADD headerContent text ;
> {code}
> That way we can easily resolve METADATA and HEADERS FetchGroups against both
> messageIdTable and imapUidTable, effectively limiting messageV3 reads to the
> FULL body reads.
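To illustrate the routing this proposal implies, here is a minimal Java sketch. It is not James code: `FetchRouting`, its `FetchGroup` and `Table` enums, and `messageV3Reads` are hypothetical names, and the three fetch groups are only the ones named above. It assumes every row is already denormalized, and shows that listing the headers of 108 messages would then need no messagev3 read, while a FULL fetch still needs one per message.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch, not the James implementation.
class FetchRouting {
    // Only the fetch groups named in the proposal.
    enum FetchGroup { METADATA, HEADERS, FULL }

    // DENORMALIZED_TABLE stands for messageIdTable or imapUidTable,
    // depending on whether the lookup is by mailbox ID + UID or by message ID.
    enum Table { DENORMALIZED_TABLE, MESSAGE_V3 }

    // Which tables a read has to touch once the new columns are populated.
    static Set<Table> tablesToRead(FetchGroup group) {
        switch (group) {
            case METADATA:
            case HEADERS:
                return EnumSet.of(Table.DENORMALIZED_TABLE);
            case FULL:
                return EnumSet.of(Table.DENORMALIZED_TABLE, Table.MESSAGE_V3);
            default:
                throw new IllegalArgumentException("Unknown fetch group: " + group);
        }
    }

    // Number of messagev3 primary key reads needed to serve messageCount messages.
    static int messageV3Reads(FetchGroup group, int messageCount) {
        return tablesToRead(group).contains(Table.MESSAGE_V3) ? messageCount : 0;
    }

    public static void main(String[] args) {
        System.out.println(messageV3Reads(FetchGroup.HEADERS, 108)); // prints 0
        System.out.println(messageV3Reads(FetchGroup.FULL, 108));    // prints 108
    }
}
```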
> h3. Expectations
> This will reduce the Cassandra query load for both JMAP and IMAP, speeding
> up James and allowing us to scale to larger workloads on the exact same
> infrastructure. A boost ranging from 25% to 33% is expected for IMAP, JMAP
> and POP3 workloads.
> h3. Migration strategy
> - 1. The admin ALTERs the tables.
> - 2. The admin deploys the new version of James. Newly written data is then
> fully denormalized...
> - 3. But previously written data still needs reads to messagev3 to be served
> (if the expected data is not in messageIdTable or in imapUidTable, we know we
> need to read it from the messagev3 table).
> - 4. We propose a migration task that looks up messagev3 to populate the
> newly created columns of messageIdTable and imapUidTable. This way an admin
> can ensure they fully benefit from the enhancement for previously existing
> data.
> I think the classical migration strategy is not a good fit for this one as:
> - Fallback mechanisms incur performance degradation (double the amount of
> reads in the transition period) and message metadata query speed is critical.
> With the proposed strategy, during the transition period, at worst the
> previous behavior is applied.
> - Creating and deleting tables is messy, whereas simple in-place
> modifications do not generate data model garbage.
> - We can add a startup check to ensure the columns are correctly here (and
> abort startup if not).
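The read-time behaviour described in step 3 of the migration strategy could look like the following Java sketch. All names here (`DenormalizedRead`, `StorageInfo`, `fromColumns`, `resolve`) are hypothetical, and it assumes that a row written before the upgrade surfaces the new columns as null. The point is only that a denormalized row costs no messagev3 read, and a non-migrated row costs exactly the one read it already costs today.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch, not the James implementation.
class DenormalizedRead {
    // Subset of the columns the proposal adds to messageIdTable and imapUidTable.
    static final class StorageInfo {
        final long bodyStartOctet;
        final long fullContentOctets;
        final String headerContent;

        StorageInfo(long bodyStartOctet, long fullContentOctets, String headerContent) {
            this.bodyStartOctet = bodyStartOctet;
            this.fullContentOctets = fullContentOctets;
            this.headerContent = headerContent;
        }
    }

    // Assumption: a row that was not migrated yet has null in the new columns.
    static Optional<StorageInfo> fromColumns(Long bodyStartOctet, Long fullContentOctets, String headerContent) {
        if (bodyStartOctet == null || fullContentOctets == null || headerContent == null) {
            return Optional.empty();
        }
        return Optional.of(new StorageInfo(bodyStartOctet, fullContentOctets, headerContent));
    }

    // Use the denormalized data when present; otherwise fall back to a messagev3 read.
    // The supplier is only invoked when the fallback is actually needed.
    static StorageInfo resolve(Optional<StorageInfo> denormalized, Supplier<StorageInfo> messageV3Read) {
        return denormalized.orElseGet(messageV3Read);
    }

    public static void main(String[] args) {
        AtomicInteger v3Reads = new AtomicInteger();
        Supplier<StorageInfo> messageV3 = () -> {
            v3Reads.incrementAndGet(); // stands in for a real messagev3 query
            return new StorageInfo(10L, 100L, "Subject: old");
        };

        resolve(fromColumns(12L, 120L, "Subject: new"), messageV3); // denormalized row
        resolve(fromColumns(null, null, null), messageV3);          // row not migrated yet
        System.out.println(v3Reads.get()); // prints 1
    }
}
```

The migration task of step 4 would then amount to running the fallback once per old row and writing the result back, so that later reads take the first branch.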
--
This message was sent by Atlassian Jira
(v8.3.4#803005)