heesung-sn commented on code in PR #20948:
URL: https://github.com/apache/pulsar/pull/20948#discussion_r1290391860
##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java:
##########
@@ -594,13 +595,18 @@ private void updateMessageMetadata(final MessageMetadata
msgMetadata, final int
}
}
- private long updateMessageMetadataSequenceId(final MessageMetadata
msgMetadata) {
+ private long updateMessageMetadataSequenceId(final MessageMetadata
msgMetadata, boolean isChunk) {
final long sequenceId;
+ // We always need to increment the value of `msgIdGenerator`,
+ // regardless of whether the user has set a sequence ID when sending a
message.
if (!msgMetadata.hasSequenceId()) {
sequenceId = msgIdGenerator++;
msgMetadata.setSequenceId(sequenceId);
+ } else if (isChunk) {
+ sequenceId = msgIdGenerator++;
Review Comment:
I see. Here, we are actually introducing a new concept, `deduplication_id,`
although it is internally still called `sequence_id`. Thi sort of conflicts
with the sequence_id concept.
```
setSequenceId(long sequenceId)
Specify a custom sequence id for the message being published.
The sequence id can be used for deduplication purposes and it needs to
follow these rules:
- sequenceId >= 0
- Sequence id for a message needs to be greater than sequence id for earlier
messages: sequenceId(N+1) > sequenceId(N)
- It's not necessary for sequence ids to be consecutive. There can be holes
between messages. Eg. the sequenceId could represent an offset or a cumulative
size.
```
Could we confirm the following cases?
case 1:
What if the user sets custom seq_id=x for one msg and then sets seq_id<=x
for the next message again? This deduplication_id(msgIdGenerator) will be still
bigger than the previous here, and the second msg will not be deduped. (Do we
have a test case for this?)
case 2:
What if one msg is chunked and the next is not?
- msg1(chunked 10 parts), cus_seqId = 1 dedupId = [1,10], next_dedupId=11
- msg2(not chunked) cus_seqId = 5, dedup_id=5, next_dedupId=6
- msg3(not chunked) cus_seqId = 8, dedup_id=8, next_dedupId=9
Here, msg2 and msg3 look to be deduped, although their cus_seqid are bigger
than prev.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]