[ https://issues.apache.org/jira/browse/JAMES-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241365#comment-17241365 ]

ASF GitHub Bot commented on JAMES-3435:
---------------------------------------

chibenwa edited a comment on pull request #255:
URL: https://github.com/apache/james-project/pull/255#issuecomment-736297452


   For one of our upcoming deployments, we are running a load-testing campaign against a testing infrastructure. This campaign aims at finding the limits of the aforementioned platform.
   
   We successfully loaded the James JMAP endpoint up to a breaking point at 5400 users (JMAP scenario run in isolation).
   
   Above that number, evidence suggests that we are CPU bound.
   
   From a Cassandra standpoint, there is high CPU usage (load of 10) that we linked to the usage of lightweight transactions / Paxos for ACLs [1] [2] [3] [4]. A detailed analysis is given in the references.
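
   To make the Paxos cost concrete, here is a minimal sketch (DataStax Java driver, hypothetical `james.mailbox_acl` table, not James's actual schema) contrasting a plain write with the LWT-guarded pattern used for ACLs: the `IF` clause is what triggers the extra Paxos round-trips and the system.paxos traffic measured in [2] [3] [4].

```java
import com.datastax.oss.driver.api.core.ConsistencyLevel;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.SimpleStatement;

// Hypothetical keyspace / table names, for illustration only.
public class LwtCostSketch {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder().build()) {
            session.execute(
                "CREATE TABLE IF NOT EXISTS james.mailbox_acl ("
                + " mailbox_id text PRIMARY KEY, acl_json text)");

            // Plain write: replicated asynchronously, no system.paxos traffic.
            session.execute(
                "UPDATE james.mailbox_acl SET acl_json = ? WHERE mailbox_id = ?",
                "{\"entries\":{}}", "mbx-1");

            // LWT write: the IF clause triggers Paxos prepare / propose /
            // commit rounds plus reads and writes on system.paxos.
            ResultSet rs = session.execute(SimpleStatement.newInstance(
                "UPDATE james.mailbox_acl SET acl_json = ? WHERE mailbox_id = ?"
                + " IF acl_json = ?",
                "{\"entries\":{\"alice\":\"lrw\"}}", "mbx-1", "{\"entries\":{}}"));
            System.out.println("applied = " + rs.wasApplied());

            // Reading LWT-guarded data consistently requires SERIAL reads,
            // which pay an extra Paxos round as well.
            session.execute(SimpleStatement
                .newInstance("SELECT acl_json FROM james.mailbox_acl WHERE mailbox_id = ?",
                    "mbx-1")
                .setConsistencyLevel(ConsistencyLevel.SERIAL));
        }
    }
}
```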
   
   This is a topic I have been arguing about for months [5]; we need to take a strong decision and enforce it.
   
   Infrastructure:
    - 3x Cassandra nodes (8 cores, 32 GB RAM, 200 GB SSD)
    - 4x James server (4 cores, 8 GB RAM)
    - ElasticSearch servers: not measured.
   
   # Actions to conduct
   
    - Perform a test run with ACL Paxos turned off.
     -> This aims at confirming the deleterious impact of its usage.
     -> Benoit & René are responsible for deploying and testing a modified instance of James on PRE-PROD, with ACL LWT turned off.
     -> Benoit will continue lobbying AGAINST the usage of strong consistency in the community [5], which is overall a Cassandra bad practice and a misfit.
     -> If conclusive, Benoit will present a data-race-proof ACL implementation on top of Cassandra leveraging CRDTs and eventual consistency (see the sketch after this list).
   
    - Perform a run with more James CPU (4 * 6 CPUs?) (René & Benoit)
     -> The goal is to determine whether we are James CPU bound or Cassandra CPU bound.
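
   As a hint of what the eventually consistent ACL direction could look like, here is a minimal sketch (hypothetical `james.mailbox_acl_v2` table, DataStax Java driver) storing rights in a Cassandra map: each entry mutation is an independent, commutative upsert or delete, so no Paxos round is needed. This is per-key last-write-wins rather than a full CRDT, but it already removes the read-modify-write race on a serialized JSON blob.

```java
import com.datastax.oss.driver.api.core.CqlSession;

// Hypothetical sketch of an eventually consistent ACL table: a map column
// instead of one LWT-guarded JSON blob. Names are illustrative only.
public class EventuallyConsistentAclSketch {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder().build()) {
            session.execute(
                "CREATE TABLE IF NOT EXISTS james.mailbox_acl_v2 ("
                + " mailbox_id text PRIMARY KEY,"
                + " rights map<text, text>)"); // entry key -> rights string

            // Grant: touches only the 'alice' map key; concurrent grants to
            // other users commute with it, no Paxos needed.
            session.execute(
                "UPDATE james.mailbox_acl_v2 SET rights['alice'] = ? WHERE mailbox_id = ?",
                "lrw", "mbx-1");

            // Revoke: a per-key delete, equally commutative.
            session.execute(
                "DELETE rights['alice'] FROM james.mailbox_acl_v2 WHERE mailbox_id = ?",
                "mbx-1");
        }
    }
}
```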
   
   # Run details
   
   
![4000-stats](https://user-images.githubusercontent.com/6928740/100713233-8dd56800-33e6-11eb-8c23-2dbe90436ab9.png)
   
   
![4000-latency](https://user-images.githubusercontent.com/6928740/100713229-8c0ba480-33e6-11eb-8328-a29252ca1a1e.png)
   
   [6] [7] show a (successful!) run of the JMAP scenario alone on top of James.
   
   
![6000-stats](https://user-images.githubusercontent.com/6928740/100713248-97f76680-33e6-11eb-831d-8a50539a6844.png)
   
   
![6000-latency](https://user-images.githubusercontent.com/6928740/100713264-9d54b100-33e6-11eb-85b7-ed64c6e40459.png)
   
   [8] [9] show a run hitting a throughput limit (5400 simultaneous users, 320 req/s), beyond which performance degrades sharply. This is the system's breaking point.
   
   # References
   
   [1] https://blog.pythian.com/lightweight-transactions-cassandra/ documents the CPU / memory / bandwidth impact of using LWT.
   
   
[dstat-cassandra.txt](https://github.com/apache/james-project/files/5621066/dstat-cassandra.txt)
   
   [2] dstat-cassandra.txt highlights CPU over-usage on the Cassandra nodes. This behavior is NOT NORMAL: read-heavy workloads are not supposed to be CPU-bound.
   
   
[cassandra-tablestats.txt](https://github.com/apache/james-project/files/5621067/cassandra-tablestats.txt)
   
   [3] cassandra-tablestats.txt shows table usage. We can see that our most-used table is, BY FAR, the system.paxos table.
   
   
[compaction-history.txt](https://github.com/apache/james-project/files/5621070/compaction-history.txt)
   
   [4] compaction-history.txt highlights how often we compact the paxos system table in comparison to other tables, further highlighting it as a hot spot.
   
   
   [5] Benoit's proposal to review lightweight transaction / Paxos usage in James: https://github.com/apache/james-project/pull/255
   
   [6] 4000-stats.png shows healthy statistics for a run with 4000 users.
   [7] 4000-latency.png shows latency evolution with respect to the number of users, for 4000 users.
   [8] 6000-stats.png shows the statistics of a run with 6000 users.
   [9] 6000-latency.png shows latency evolution with respect to the number of users, for 6000 users. The performance breakage can be seen at 5400 users.




> Relaxing LWT usage: domain, users
> ---------------------------------
>
>                 Key: JAMES-3435
>                 URL: https://issues.apache.org/jira/browse/JAMES-3435
>             Project: James Server
>          Issue Type: Improvement
>          Components: cassandra
>    Affects Versions: master
>            Reporter: Benoit Tellier
>            Priority: Major
>             Fix For: master
>
>
> https://www.mail-archive.com/server-dev@james.apache.org/msg68713.html
> {code:java}
> Cassandra is an eventually consistent datastore that can be used in a
> consistent fashion. To do so, we rely on a mechanism called "Lightweight
> Transactions (LWT)". Lightweight transactions rely on the Paxos
> distributed consensus algorithm to enforce a condition upon data
> mutation. A table, system.paxos, is used to track the state of
> transactions. Furthermore, upon writes, several round-trips (two) are
> needed to ensure data integrity across replicas (the minimum round trips
> to achieve consensus), and the system.paxos table is read / written to in
> addition to the applicative table.
> All of this causes LWT to be significantly slower than their lower
> consistency counterparts. On some Linagora-owned production instances,
> regular reads take 2ms while reads on tables relying on LWT take 6ms.
> Similar figures are found for writes. We also noticed some high
> compaction throughput on the paxos table, leading to many background
> writes.
> Given the massive impact of LWT usage on performance, and given the lack
> of debate upon LWT adoption, I would like to re-challenge their usage...
> Here are the places we rely on LWT for the Distributed Server:
>  - IMAP UID generation (monotonic integer) - strong consistency is
> strictly required so as not to lose data, as overwriting a UID means
> overwriting a message.
>  - IMAP ModSeq generation (monotonic integer) - strong consistency is
> required, as ModSeq overwrites can lead to some data not being well
> synchronised.
>  - Domains and users - we rely on LWT to return an error when deleting a
> user that does not exist, or creating an already existing user. It sounds
> unnecessary.
>  - Message flags rely on LWT to ensure updates are not overwritten. As
> often-read metadata, the impact is high, for limited criticality for
> the end user. After all, no data is lost, only a user action like
> marking a message as Seen, an action that they can very well perform again.
>  - Mailbox path registration, ACL - required to prevent data races.
> My proposal would be:
>  - Keep using LWT for UID and ModSeq generation, as well as mailbox path
> registration.
>  - Make the use of LWT for message flag updates configurable - as an
> admin I can choose to disable it.
>  - I am also fine with completely removing LWT usage for message flag
> updates.
>  - No longer use LWT on domains or users. Instead use idempotent create /
> delete. The contract tests will thus need to be relaxed.
>  - In the long term, relying on a CRDT to represent ACLs at the
> Cassandra level, instead of serialized JSON, would enable getting rid of
> LWT usage on the ACL table.
> {code}
> Let's start by relaxing LWT usage for users & domains.
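
To make the quoted proposal concrete, here is a minimal sketch of the relaxed flavour for users & domains, using the DataStax Java driver against a hypothetical `james.users` table (not James's actual schema):

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;

// Hypothetical sketch: replacing the LWT-guarded user creation with plain
// idempotent upserts / deletes. Table and keyspace names are illustrative.
public class RelaxedUserStoreSketch {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder().build()) {
            // Current LWT flavour: signals "already exists", at the price of
            // a Paxos round (and system.paxos traffic) on every creation.
            ResultSet rs = session.execute(
                "INSERT INTO james.users (name, password) VALUES (?, ?) IF NOT EXISTS",
                "alice", "hash");
            System.out.println("created = " + rs.wasApplied());

            // Relaxed flavour: unconditional, idempotent writes. Creating an
            // existing user or deleting a missing one simply succeeds, hence
            // the contract tests need to be relaxed.
            session.execute(
                "INSERT INTO james.users (name, password) VALUES (?, ?)",
                "alice", "hash");
            session.execute("DELETE FROM james.users WHERE name = ?", "alice");
        }
    }
}
```

By contrast, UID generation is a place where the quoted proposal keeps LWT, because a compare-and-set retry loop is what prevents two nodes from handing out the same UID. Again a sketch, with a hypothetical `james.uid_counter` table:

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;

// Hypothetical sketch of why UID generation keeps LWT: a Paxos-backed
// compare-and-set loop. Without the IF condition, two James nodes could
// allocate the same UID and one message would overwrite another.
public class UidAllocatorSketch {
    static long nextUid(CqlSession session, String mailboxId) {
        while (true) {
            Row row = session.execute(
                "SELECT next_uid FROM james.uid_counter WHERE mailbox_id = ?",
                mailboxId).one();
            long current = row.getLong("next_uid");
            ResultSet cas = session.execute(
                "UPDATE james.uid_counter SET next_uid = ? WHERE mailbox_id = ? IF next_uid = ?",
                current + 1, mailboxId, current);
            if (cas.wasApplied()) {
                return current + 1; // we won the Paxos round: the UID is ours
            }
            // Another node allocated concurrently: re-read and retry.
        }
    }
}
```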


