[ 
https://issues.apache.org/jira/browse/JAMES-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272885#comment-17272885
 ] 

Raphael Ouazana commented on JAMES-3495:
----------------------------------------

Would we also need a repair task? Or maybe an auto-repair?

> MessageId field in messageIdTable can be null
> ---------------------------------------------
>
>                 Key: JAMES-3495
>                 URL: https://issues.apache.org/jira/browse/JAMES-3495
>             Project: James Server
>          Issue Type: New Feature
>          Components: cassandra, mailbox
>    Affects Versions: 3.6.0
>            Reporter: Benoit Tellier
>            Priority: Major
>
> I observed some data corruption on one of our production instances, where a
> user had one message in their messageIdTable with a null messageId. The
> impact was a persistent IMAP FETCH error (NPE) denying the user access to
> their mailbox.
> At that time I did not understand its origin, and thought it was due to
> a hard shutdown.
> After listening to https://www.youtube.com/watch?v=86olupkuLlU while
> performing some post-running stretches, I realized that Discord encountered
> similar issues while migrating from MongoDB to Cassandra.
> They encountered the very same symptoms, which were caused by out-of-order
> updates. Something like:
> {code:java}
>         CassandraId mailboxId = CassandraId.timeBased();
>         MessageUid messageUid = MessageUid.of(1);
>         CassandraMessageId messageId = messageIdFactory.generate();
>         testee.insert(ComposedMessageIdWithMetaData.builder()
>                 .composedMessageId(new ComposedMessageId(mailboxId, messageId, messageUid))
>                 .flags(new Flags())
>                 .modSeq(ModSeq.of(1))
>                 .build())
>             .block();
>         testee.delete(mailboxId, messageUid).block();
>         testee.updateMetadata(ComposedMessageIdWithMetaData.builder()
>                 .composedMessageId(new ComposedMessageId(mailboxId, messageId, messageUid))
>                 .flags(new Flags(org.apache.james.mailbox.cassandra.table.Flag.ANSWERED))
>                 .modSeq(ModSeq.of(2))
>                 .build())
>             .block();
>         Optional<ComposedMessageIdWithMetaData> message = testee.retrieve(mailboxId, messageUid).block();
>         System.out.println(message);
>         assertThat(message.isPresent()).isFalse();
> {code}
> Would print:
> {code:java}
> Optional[ComposedMessageIdWithMetaData{composedMessageId=ComposedMessageId{mailboxId=CassandraId{id=2662a6e0-60a2-11eb-89ca-8540fda15edb},
>  messageId=CassandraMessageId{uuid=null}, uid=MessageUid{uid=1}}, 
> flags=flagAnswered, modSeq=ModSeq{value=2}}]
> {code}
> How to solve this?
>  - Either ignore the delete. We should thus specify the messageId on each
> update so that it cannot end up being null. This ends up updating too much
> data and likely has a performance impact.
>  - Or filter out entries with a null messageId, as we know for sure that they
> were deleted. That is my personal preference.
> We likely need to audit other tables where partial updates are performed and
> could result in similar issues.
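
To illustrate the second option, here is a minimal, self-contained sketch of the "filter out null messageId" idea. It does not use the James or Cassandra APIs: the class `NullMessageIdFilterSketch`, its `Row` type and all method names are hypothetical stand-ins, and the in-memory map only imitates the fact that a Cassandra UPDATE is an upsert, so a partial update arriving after a delete re-creates the row with only the updated columns set.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical in-memory stand-in for messageIdTable; NOT the James DAO.
final class NullMessageIdFilterSketch {

    // Only the columns relevant to the issue.
    static final class Row {
        UUID messageId; // never written by updateMetadata
        String flags;
        long modSeq;
    }

    private final Map<Long, Row> rowsByUid = new HashMap<>();

    // Full insert: every column is written, including messageId.
    void insert(long uid, UUID messageId, String flags, long modSeq) {
        Row row = new Row();
        row.messageId = messageId;
        row.flags = flags;
        row.modSeq = modSeq;
        rowsByUid.put(uid, row);
    }

    void delete(long uid) {
        rowsByUid.remove(uid);
    }

    // Partial update modeled as an upsert: if the row is gone, it is
    // re-created with only flags and modSeq set, so messageId stays null.
    void updateMetadata(long uid, String flags, long modSeq) {
        Row row = rowsByUid.computeIfAbsent(uid, ignored -> new Row());
        row.flags = flags;
        row.modSeq = modSeq;
    }

    // Read without any guard: exposes the corrupted row.
    Optional<Row> retrieveUnfiltered(long uid) {
        return Optional.ofNullable(rowsByUid.get(uid));
    }

    // Read with the proposed guard: a row without messageId is treated as deleted.
    Optional<Row> retrieve(long uid) {
        return retrieveUnfiltered(uid).filter(row -> row.messageId != null);
    }

    public static void main(String[] args) {
        NullMessageIdFilterSketch table = new NullMessageIdFilterSketch();
        table.insert(1L, UUID.randomUUID(), "", 1L);
        table.delete(1L);
        table.updateMetadata(1L, "flagAnswered", 2L); // out-of-order update
        System.out.println("unfiltered present: " + table.retrieveUnfiltered(1L).isPresent());
        System.out.println("filtered present: " + table.retrieve(1L).isPresent());
    }
}
```

With the guard in place, the read path reports the message as absent instead of handing a null messageId to callers. The leftover row still exists in storage, which is where a repair task could come in.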



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
