[
https://issues.apache.org/jira/browse/JAMES-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272885#comment-17272885
]
Raphael Ouazana commented on JAMES-3495:
----------------------------------------
Would we also need a repair task? Or maybe an auto-repair?
> MessageId field in messageIdTable can be null
> ---------------------------------------------
>
> Key: JAMES-3495
> URL: https://issues.apache.org/jira/browse/JAMES-3495
> Project: James Server
> Issue Type: New Feature
> Components: cassandra, mailbox
> Affects Versions: 3.6.0
> Reporter: Benoit Tellier
> Priority: Major
>
> I observed some data corruption on one of our production instances: a user
> had one message in their messageIdTable with a null messageId. The impact
> was a persistent IMAP FETCH error (an NPE) that denied the user access to
> their mailbox.
> At the time I did not understand its origin, and thought it was due to
> a hard shutdown.
> After listening to https://www.youtube.com/watch?v=86olupkuLlU while
> performing some post-running stretches, I realized that Discord encountered
> similar issues while migrating from MongoDB to Cassandra.
> They encountered the very same symptoms, which were caused by out-of-order
> updates. Something like:
> {code:java}
> CassandraId mailboxId = CassandraId.timeBased();
> MessageUid messageUid = MessageUid.of(1);
> CassandraMessageId messageId = messageIdFactory.generate();
> testee.insert(ComposedMessageIdWithMetaData.builder()
>         .composedMessageId(new ComposedMessageId(mailboxId, messageId, messageUid))
>         .flags(new Flags())
>         .modSeq(ModSeq.of(1))
>         .build())
>     .block();
> testee.delete(mailboxId, messageUid).block();
> testee.updateMetadata(ComposedMessageIdWithMetaData.builder()
>         .composedMessageId(new ComposedMessageId(mailboxId, messageId, messageUid))
>         .flags(new Flags(org.apache.james.mailbox.cassandra.table.Flag.ANSWERED))
>         .modSeq(ModSeq.of(2))
>         .build())
>     .block();
> Optional<ComposedMessageIdWithMetaData> message =
>     testee.retrieve(mailboxId, messageUid).block();
> System.out.println(message);
> assertThat(message.isPresent()).isFalse();
> {code}
> Would print:
> {code:java}
> Optional[ComposedMessageIdWithMetaData{composedMessageId=ComposedMessageId{mailboxId=CassandraId{id=2662a6e0-60a2-11eb-89ca-8540fda15edb},
> messageId=CassandraMessageId{uuid=null}, uid=MessageUid{uid=1}},
> flags=flagAnswered, modSeq=ModSeq{value=2}}]
> {code}
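
To make the mechanism easier to follow, here is a minimal, self-contained sketch of why the row comes back with a null messageId. It is only a toy in-memory model, not James or Cassandra driver code: the class `UpsertSketch`, its string keys and its column names are hypothetical. It assumes what the reproduction above suggests, namely that the metadata update behaves as an upsert touching only the flags and modSeq columns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of a table whose rows are sets of independently written columns. */
public class UpsertSketch {
    // Hypothetical stand-in for the table, keyed by "mailboxId/uid".
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    /** Full insert: writes every column, including messageId. */
    public void insert(String key, String messageId, String flags, long modSeq) {
        Map<String, Object> row = rows.computeIfAbsent(key, k -> new HashMap<>());
        row.put("messageId", messageId);
        row.put("flags", flags);
        row.put("modSeq", modSeq);
    }

    /** Delete: drops every column of the row. */
    public void delete(String key) {
        rows.remove(key);
    }

    /** Partial update modeled as an upsert: creates the row if it is missing. */
    public void updateMetadata(String key, String flags, long modSeq) {
        Map<String, Object> row = rows.computeIfAbsent(key, k -> new HashMap<>());
        row.put("flags", flags);
        row.put("modSeq", modSeq);
        // messageId is deliberately not written here.
    }

    public Optional<Map<String, Object>> retrieve(String key) {
        return Optional.ofNullable(rows.get(key));
    }

    public static void main(String[] args) {
        UpsertSketch table = new UpsertSketch();
        table.insert("mailbox1/uid1", "msg-1", "", 1);
        table.delete("mailbox1/uid1");
        // The update arrives after the delete and resurrects a partial row.
        table.updateMetadata("mailbox1/uid1", "flagAnswered", 2);

        Map<String, Object> row = table.retrieve("mailbox1/uid1").get();
        System.out.println("messageId=" + row.get("messageId")
            + ", flags=" + row.get("flags")
            + ", modSeq=" + row.get("modSeq"));
    }
}
```

In this model the row exists again after the late update, but its messageId column was never rewritten, which matches the symptom shown in the output above.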
> How to solve this?
> - Either ignore the delete. We would then have to specify the messageId on
> every update so that it can never end up null. This ends up writing too much
> data and likely has a performance impact.
> - Or filter out entries with a null messageId, as we know for sure that they
> were deleted. That is my personal preference.
> We likely need to audit other tables where partial updates are performed, as
> they could suffer from similar issues.
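
For the second option, a rough sketch of the read-side filter could look like the following. It is not the actual James DAO code: `NullMessageIdFilter`, `Entry` and `filterDeleted` are hypothetical names, and `Entry` is a simplified stand-in for ComposedMessageIdWithMetaData. The only assumption is that a null messageId reliably marks an entry as deleted, as argued above.

```java
import java.util.Optional;

public class NullMessageIdFilter {
    /** Hypothetical, simplified stand-in for ComposedMessageIdWithMetaData. */
    public static final class Entry {
        public final String messageId; // null when only a partial update survived
        public final long uid;
        public final long modSeq;

        public Entry(String messageId, long uid, long modSeq) {
            this.messageId = messageId;
            this.uid = uid;
            this.modSeq = modSeq;
        }
    }

    /** Treats an entry whose messageId is null as if it had been deleted. */
    public static Optional<Entry> filterDeleted(Optional<Entry> entry) {
        return entry.filter(e -> e.messageId != null);
    }

    public static void main(String[] args) {
        Entry ghost = new Entry(null, 1, 2);    // left behind by a late update
        Entry valid = new Entry("msg-1", 2, 1); // regular, fully written entry

        System.out.println(filterDeleted(Optional.of(ghost)).isPresent());
        System.out.println(filterDeleted(Optional.of(valid)).isPresent());
    }
}
```

A filter like this only hides the leftover rows from readers. It does not remove them from the table, so it does not replace a repair task.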