[jira] [Updated] (JAMES-3495) MessageId field in messageIdTable can be null

Benoit Tellier (Jira) Wed, 27 Jan 2021 05:23:10 -0800


[ https://issues.apache.org/jira/browse/JAMES-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Benoit Tellier updated JAMES-3495:
----------------------------------
    Description: 
I observed some data corruption on one of our production instances, where a 
user had one message in their messageIdTable with a null messageId. The 
impact was a persistent IMAP FETCH error (NPE) that denied the user access to 
their mailbox.

At the time I did not understand its origin, and thought it was due to a 
hard shutdown.

After listening to https://www.youtube.com/watch?v=86olupkuLlU while doing 
some post-run stretches, I realized that Discord encountered similar issues 
while migrating from MongoDB to Cassandra.

They encountered the very same symptom, which was caused by out-of-order 
updates. Something like:


{code:java}
        CassandraId mailboxId = CassandraId.timeBased();
        MessageUid messageUid = MessageUid.of(1);
        CassandraMessageId messageId = messageIdFactory.generate();

        // 1. Insert a full entry, messageId included.
        testee.insert(ComposedMessageIdWithMetaData.builder()
                .composedMessageId(new ComposedMessageId(mailboxId, messageId, messageUid))
                .flags(new Flags())
                .modSeq(ModSeq.of(1))
                .build())
            .block();

        // 2. Delete the entry.
        testee.delete(mailboxId, messageUid).block();

        // 3. A metadata update for the deleted entry is applied after the delete.
        testee.updateMetadata(ComposedMessageIdWithMetaData.builder()
                .composedMessageId(new ComposedMessageId(mailboxId, messageId, messageUid))
                .flags(new Flags(org.apache.james.mailbox.cassandra.table.Flag.ANSWERED))
                .modSeq(ModSeq.of(2))
                .build())
            .block();

        Optional<ComposedMessageIdWithMetaData> message = testee.retrieve(mailboxId, messageUid).block();
        System.out.println(message);
        // Expected: the entry stays deleted. Observed: an entry is returned.
        assertThat(message.isPresent()).isFalse();
{code}

Would print:

{code:java}
Optional[ComposedMessageIdWithMetaData{composedMessageId=ComposedMessageId{mailboxId=CassandraId{id=2662a6e0-60a2-11eb-89ca-8540fda15edb},
 messageId=CassandraMessageId{uuid=null}, uid=MessageUid{uid=1}}, 
flags=flagAnswered, modSeq=ModSeq{value=2}}]
{code}
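
The mechanism can be reproduced without Cassandra. A CQL UPDATE is an upsert: 
applied to a row that no longer exists, it re-creates the row with only the 
columns named in the statement, so every other column reads back as null. The 
sketch below models this with a plain in-memory map. UpsertModel, its column 
names and its methods are hypothetical and only mimic the DAO calls above; 
they are not James or driver APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of a table with upsert semantics; not a James or Cassandra API. */
public class UpsertModel {
    // Primary key (mailboxId + uid, flattened to a String) -> column name -> value.
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    /** Full insert: writes every column, messageId included. */
    public void insert(String key, String messageId, String flags, long modSeq) {
        Map<String, Object> columns = new HashMap<>();
        columns.put("messageId", messageId);
        columns.put("flags", flags);
        columns.put("modSeq", modSeq);
        rows.put(key, columns);
    }

    /** Removes the whole row. */
    public void delete(String key) {
        rows.remove(key);
    }

    /** Partial update: writes flags and modSeq only; a missing row is re-created. */
    public void updateMetadata(String key, String flags, long modSeq) {
        Map<String, Object> columns = rows.computeIfAbsent(key, k -> new HashMap<>());
        columns.put("flags", flags);
        columns.put("modSeq", modSeq);
    }

    public Optional<Map<String, Object>> retrieve(String key) {
        return Optional.ofNullable(rows.get(key));
    }

    public static void main(String[] args) {
        UpsertModel table = new UpsertModel();
        table.insert("mailbox-1/uid-1", "message-1", "", 1);
        table.delete("mailbox-1/uid-1");
        // The update lands after the delete and resurrects the row.
        table.updateMetadata("mailbox-1/uid-1", "flagAnswered", 2);

        Map<String, Object> row = table.retrieve("mailbox-1/uid-1").get();
        System.out.println(row.get("messageId")); // prints "null"
        System.out.println(row.get("flags"));     // prints "flagAnswered"
    }
}
```

The resurrected row carries the flags and modSeq of the late update but no 
messageId, which matches the entry printed above.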

How to solve this?

 - Either ignore the delete: specify the messageId on each update so that it 
cannot end up being null. This ends up updating too much data and likely has a 
performance impact.
 - Or filter out entries with a null messageId, as we know for sure that they 
were deleted. That is my personal preference.
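
The second option could look like the read-side filter below. This is a 
minimal sketch, not the James DAO code: Entry, NullMessageIdFilter and 
keepValid are hypothetical names, and Entry is a simplified stand-in for a 
messageIdTable row.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class NullMessageIdFilter {
    /** Simplified stand-in for a messageIdTable row (hypothetical). */
    public static final class Entry {
        final String messageId; // null when a late partial update re-created the row
        final long uid;

        public Entry(String messageId, long uid) {
            this.messageId = messageId;
            this.uid = uid;
        }
    }

    /** Drops rows without a messageId, treating them as already deleted. */
    public static List<Entry> keepValid(List<Entry> rows) {
        return rows.stream()
            .filter(row -> row.messageId != null)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Entry> rows = Arrays.asList(
            new Entry("message-1", 1),
            new Entry(null, 2)); // resurrected row
        System.out.println(keepValid(rows).size()); // prints 1
    }
}
```

The filter leaves the stale rows in the table; it only hides them from 
readers, so it avoids the NPE without rewriting more data on each update.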

We likely need to audit other tables where partial updates are performed, as 
they could suffer from similar issues.




> MessageId field in messageIdTable can be null
> ---------------------------------------------
>
>                 Key: JAMES-3495
>                 URL: https://issues.apache.org/jira/browse/JAMES-3495
>             Project: James Server
>          Issue Type: New Feature
>          Components: cassandra, mailbox
>    Affects Versions: 3.6.0
>            Reporter: Benoit Tellier
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
