Hi all,

We would like to inform you that we will focus our efforts on the following two topics over the coming weeks:

* James administration procedures (Ops friendly)
* Solve Cassandra inconsistencies on the Mailbox object

## James administration procedures (Ops friendly)

Scope target: guice-distributed James product

The goal here is to write documentation on administration procedures that an ops engineer or system administrator can follow to solve a specific issue when it is encountered. For the moment, we have decided to cover the following items:

* Checking James health
* Mail processing
* Event Bus
* ES indexing
* Updating Cassandra schema version
* Solving Cassandra inconsistencies
* Cassandra table level parameters and why
* Mail queue administration
* DeletedMessageVault

We hope this will be helpful for the community in general.


## Solve Cassandra inconsistencies on the Mailbox object

The `mailboxPath` and `mailbox` tables are tightly related: we usually first query `mailboxPath` to find the Mailbox ID for a given mailbox name or path, and then fetch the Mailbox information from the `mailbox` table.

Cassandra, as a NoSQL DB, does not support transactions, so the two tables cannot be updated atomically. Inconsistencies can therefore appear, for instance if some writes fail due to Cassandra performance issues. We have observed some of these inconsistencies on the Mailbox object in Cassandra, with James failing to find mailboxes that are referenced in the `mailboxPath` table but do not exist in the `mailbox` table.
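To make the two-step lookup and the resulting inconsistency concrete, here is a minimal sketch in Java. It is not James code: the class `MailboxLookupSketch`, its method names, and the use of in-memory maps as stand-ins for the two Cassandra tables are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory model of the two tables (not the James implementation).
final class MailboxLookupSketch {
    // Stand-in for the mailboxPath table: mailbox path -> mailbox ID.
    final Map<String, String> mailboxPath = new LinkedHashMap<>();
    // Stand-in for the mailbox table: mailbox ID -> mailbox information (here, just a name).
    final Map<String, String> mailbox = new LinkedHashMap<>();

    // Two-step read: resolve the ID from mailboxPath, then load the mailbox by ID.
    // Returns empty if the path is unknown OR if the ID points to no mailbox row.
    Optional<String> findMailbox(String path) {
        return Optional.ofNullable(mailboxPath.get(path)).map(mailbox::get);
    }

    // Lists the paths whose mailbox ID has no matching row in the mailbox table,
    // which is the inconsistency described above.
    List<String> danglingPaths() {
        List<String> dangling = new ArrayList<>();
        for (Map.Entry<String, String> entry : mailboxPath.entrySet()) {
            if (!mailbox.containsKey(entry.getValue())) {
                dangling.add(entry.getKey());
            }
        }
        return dangling;
    }
}
```

In this model, a path registered in `mailboxPath` without its `mailbox` row makes `findMailbox` return an empty result even though the path looks like it exists, and `danglingPaths` reports it.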

For example, when creating a new mailbox, we first create the entry in `mailboxPath`, then in `mailbox`. If the second operation fails, we end up with the above-mentioned inconsistency. To try to reduce that, we decided to:

* Add more unit tests to better understand the potential inconsistency cases with Mailbox
* Separate the rename and create mailbox logic (it's shared for now), as we do an extra read on Cassandra to delete the previous mailboxPath when we rename a mailbox. Separating the logic would let us avoid that read, which is unnecessary in the create case, and would make the code more robust against inconsistencies.
* Retry the 'create mailbox' step (the one after mailboxPath creation) if it fails.
* Expose a webadmin task to solve mailbox inconsistencies
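
The retry idea from the list above could look roughly like the following sketch. It is only an illustration under stated assumptions: `CreateMailboxSketch`, `createMailbox`, the `mailboxWrite` callback and the `maxAttempts` parameter are hypothetical names, the `mailboxPath` table is modelled as an in-memory map, and a real implementation would also need to decide on delays between attempts.

```java
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch of "write mailboxPath first, then retry the mailbox write".
final class CreateMailboxSketch {
    // mailboxPath: stand-in for the mailboxPath table (path -> mailbox ID).
    // mailboxWrite: stand-in for the write to the mailbox table; it may throw on failure.
    // Returns true if both writes succeeded, false if the mailbox write failed on every
    // attempt. In that case the mailboxPath entry is left dangling and would need to be
    // repaired later, for instance by a dedicated task.
    static boolean createMailbox(Map<String, String> mailboxPath,
                                 BiConsumer<String, String> mailboxWrite,
                                 String path, String id, String name,
                                 int maxAttempts) {
        // Step 1: register the path.
        mailboxPath.put(path, id);
        // Step 2: write the mailbox itself, retrying on failure.
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                mailboxWrite.accept(id, name);
                return true;
            } catch (RuntimeException e) {
                // Swallow the failure and try again until attempts are exhausted.
            }
        }
        return false;
    }
}
```

Retrying only narrows the window in which the two tables disagree; it does not remove it, which is why a separate way to repair existing inconsistencies is still useful.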

We also realize that inconsistencies can occur on other objects, and that in the future we will need a more systematic approach to address this problem.


If you have any comments or feedback on this, please don't hesitate to reply.

Best regards,
Rene Cordier.
