[ https://issues.apache.org/jira/browse/JAMES-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benoit Tellier closed JAMES-3660.
---------------------------------
    Fix Version/s: 3.7.0
       Resolution: Fixed

> Cassandra mailbox creation unstable under high concurrency
> ----------------------------------------------------------
>
>                 Key: JAMES-3660
>                 URL: https://issues.apache.org/jira/browse/JAMES-3660
>             Project: James Server
>          Issue Type: Improvement
>            Reporter: Benoit Tellier
>            Priority: Major
>             Fix For: 3.7.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The org.apache.james.mailbox.cassandra.CassandraMailboxManagerTest$WithBatchSize.creatingConcurrentlyMailboxesWithSameParentShouldNotFail
> test alone is enough to trigger instability on the Apache CI:
> https://ci-builds.apache.org/job/james/job/ApacheJames/job/PR-685/1/
> {code:java}
> Error Message
> java.lang.RuntimeException: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency SERIAL (1 responses were required but only 0 replica responded)
> Stacktrace
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency SERIAL (1 responses were required but only 0 replica responded)
> Caused by: java.lang.RuntimeException: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency SERIAL (1 responses were required but only 0 replica responded)
> Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency SERIAL (1 responses were required but only 0 replica responded)
> Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency SERIAL (1 responses were required but only 0 replica responded)
> Standard Output
> 11:29:54.751 [ERROR] o.a.j.u.c.ConcurrentTestRunner - Error caught during concurrent testing (iteration 0, threadNumber 1)
> com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency SERIAL (1 responses were required but only 0 replica responded)
>     at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:90)
>     at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:65)
>     at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:297)
>     at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:268)
>     at com.datastax.shaded.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>     ... 25 common frames omitted
> Wrapped by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency SERIAL (1 responses were required but only 0 replica responded)
> {code}
> In short, the LWT (lightweight transaction) usage alone is enough to create contention.
> Looking closer at the issue, StoreMailboxManager performs numerous defensive
> SERIAL reads (each one an empty Paxos commit), which further degrades
> performance and increases contention.
> I believe removing these defensive reads would make our code more stable.
> Removing them made the test about twice as fast (x2) for
> gConcurrentlyMailboxesWithSameParentShouldNotFail

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
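
The cost of a defensive SERIAL read ahead of a conditional write can be illustrated without Cassandra. What follows is a minimal in-memory sketch, not the James code: `SerialStore`, `createWithDefensiveRead` and `createDirect` are hypothetical names, a `ConcurrentHashMap` stands in for a table guarded by lightweight transactions, and each call on the store is simply counted as one serial round.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class MailboxCreationSketch {

    // Hypothetical stand-in for a table guarded by lightweight transactions.
    // No Cassandra is involved; every call is counted as one serial round.
    static class SerialStore {
        private final ConcurrentMap<String, Boolean> rows = new ConcurrentHashMap<>();
        private final AtomicInteger serialRounds = new AtomicInteger();

        // Stand-in for a read at SERIAL consistency.
        boolean serialExists(String path) {
            serialRounds.incrementAndGet();
            return rows.containsKey(path);
        }

        // Stand-in for INSERT ... IF NOT EXISTS.
        // Returns true only when this call created the row.
        boolean insertIfNotExists(String path) {
            serialRounds.incrementAndGet();
            return rows.putIfAbsent(path, Boolean.TRUE) == null;
        }

        int serialRounds() {
            return serialRounds.get();
        }
    }

    // Defensive variant: check existence first, then do the conditional insert.
    static boolean createWithDefensiveRead(SerialStore store, String path) {
        if (store.serialExists(path)) {
            return false;
        }
        return store.insertIfNotExists(path);
    }

    // Direct variant: the conditional insert already reports whether the row existed.
    static boolean createDirect(SerialStore store, String path) {
        return store.insertIfNotExists(path);
    }

    public static void main(String[] args) {
        SerialStore defensive = new SerialStore();
        SerialStore direct = new SerialStore();
        for (int i = 0; i < 10; i++) {
            String path = "parent.child" + i; // hypothetical mailbox paths
            createWithDefensiveRead(defensive, path);
            createDirect(direct, path);
        }
        System.out.println("defensive serial rounds: " + defensive.serialRounds());
        System.out.println("direct serial rounds:    " + direct.serialRounds());
    }
}
```

In this model the defensive variant spends two serial rounds per new mailbox and the direct variant one, while both still reject a duplicate. The sketch only counts rounds; it does not reproduce Paxos contention or timeouts on a real cluster.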