poorbarcode commented on code in PR #20948:
URL: https://github.com/apache/pulsar/pull/20948#discussion_r1303709669


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/MessageDeduplication.java:
##########
@@ -337,12 +338,23 @@ public MessageDupStatus isDuplicate(PublishContext publishContext, ByteBuf heade
             publishContext.setOriginalHighestSequenceId(highestSequenceId);
             headersAndPayload.readerIndex(readerIndex);
         }
-
+        long chunkID = 0;
+        if (publishContext.isChunked()) {
+            if (md == null) {
+                headersAndPayload.markReaderIndex();

Review Comment:
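   I suggest saving and restoring the reader index explicitly when reading the 
metadata, in case the buffer has already been marked elsewhere (a second 
`markReaderIndex()` would overwrite that mark).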
   
   ```java
   int readerIndex = headersAndPayload.readerIndex();
   ....
   headersAndPayload.readerIndex(readerIndex);
   ```
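
To illustrate the concern, here is a minimal sketch. It uses `java.nio.ByteBuffer` as a stand-in for Netty's `ByteBuf` so that it runs without dependencies: `mark()`/`reset()` play the role of `markReaderIndex()`/`resetReaderIndex()`, and `position()` plays the role of `readerIndex()`. The class and method names are hypothetical and not part of the PR.

```java
import java.nio.ByteBuffer;

// Sketch only: ByteBuffer stands in for Netty's ByteBuf. Both keep a single
// mark, so a nested mark overwrites the one set by the caller.
final class ReaderIndexSketch {

    // Hypothetical helper: peeks one int using mark/reset, like the diff does.
    static int peekWithMark(ByteBuffer buf) {
        buf.mark();                 // overwrites any mark the caller set earlier
        int value = buf.getInt();
        buf.reset();
        return value;
    }

    // Hypothetical helper: peeks one int by saving and restoring the position.
    static int peekWithSavedIndex(ByteBuffer buf) {
        int saved = buf.position();
        int value = buf.getInt();
        buf.position(saved);        // the caller's mark is left untouched
        return value;
    }

    // The caller marks position 0, reads one int, peeks at the next one, and
    // then resets. Returns the position the caller ends up at.
    static int callerPositionAfterReset(boolean useMark) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(1).putInt(2).putInt(3).putInt(4);
        buf.flip();
        buf.mark();                 // caller's mark at position 0
        buf.getInt();               // position is now 4
        if (useMark) {
            peekWithMark(buf);
        } else {
            peekWithSavedIndex(buf);
        }
        buf.reset();                // caller expects to be back at position 0
        return buf.position();
    }

    public static void main(String[] args) {
        System.out.println("mark/reset helper:  " + callerPositionAfterReset(true));
        System.out.println("saved-index helper: " + callerPositionAfterReset(false));
    }
}
```

With the mark-based helper the caller's reset lands on the helper's mark rather than its own; with the saved-index helper the caller returns to where it marked.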



##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java:
##########
@@ -1449,6 +1449,15 @@ private ByteBuf processMessageChunk(ByteBuf compressedPayload, MessageMetadata m
         // discard message if chunk is out-of-order
         if (chunkedMsgCtx == null || chunkedMsgCtx.chunkedMsgBuffer == null
                 || msgMetadata.getChunkId() != (chunkedMsgCtx.lastChunkedMessageId + 1)) {
+            // Filter duplicated chunks instead of discard it.
+            if (chunkedMsgCtx == null || msgMetadata.getChunkId() <= chunkedMsgCtx.lastChunkedMessageId) {

Review Comment:
   Why check `chunkedMsgCtx == null` here?
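
For context, a rough sketch of the cases this branch has to tell apart. All names here are hypothetical, and a `null` last chunk ID models a missing `chunkedMsgCtx`. The sketch deliberately treats the missing-context case on its own rather than folding it into the duplicate check; that is an assumption made for illustration, not a statement of what the PR should do.

```java
// Sketch only: hypothetical names, not the ConsumerImpl implementation.
final class ChunkOrderSketch {

    enum Verdict { ACCEPT, DUPLICATE, OUT_OF_ORDER }

    // lastChunkId == null models "no chunk context exists for this message yet".
    static Verdict classify(Integer lastChunkId, int chunkId) {
        if (lastChunkId == null) {
            // Assumption: without a context, only the first chunk can start a message.
            return chunkId == 0 ? Verdict.ACCEPT : Verdict.OUT_OF_ORDER;
        }
        if (chunkId == lastChunkId + 1) {
            return Verdict.ACCEPT;       // the next expected chunk
        }
        if (chunkId <= lastChunkId) {
            return Verdict.DUPLICATE;    // already seen, safe to filter out
        }
        return Verdict.OUT_OF_ORDER;     // a gap: at least one chunk was skipped
    }

    public static void main(String[] args) {
        System.out.println(classify(2, 2));    // a repeated chunk
        System.out.println(classify(1, 3));    // a skipped chunk
        System.out.println(classify(null, 2)); // no context for a non-first chunk
    }
}
```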



##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/MessageDeduplication.java:
##########
@@ -337,12 +338,23 @@ public MessageDupStatus isDuplicate(PublishContext publishContext, ByteBuf heade
             publishContext.setOriginalHighestSequenceId(highestSequenceId);
             headersAndPayload.readerIndex(readerIndex);
         }
-
+        long chunkID = 0;

Review Comment:
   If the message is not a chunked message, the chunk ID will be `0`, right? 
Maybe setting it to `-1` is better?



##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/MessageDeduplication.java:
##########
@@ -337,12 +338,23 @@ public MessageDupStatus isDuplicate(PublishContext publishContext, ByteBuf heade
             publishContext.setOriginalHighestSequenceId(highestSequenceId);
             headersAndPayload.readerIndex(readerIndex);
         }
-
+        long chunkID = 0;
+        if (publishContext.isChunked()) {
+            if (md == null) {
+                headersAndPayload.markReaderIndex();
+                md = Commands.parseMessageMetadata(headersAndPayload);
+                headersAndPayload.resetReaderIndex();
+            }
+            chunkID = md.getChunkId();
+        }
         // Synchronize the get() and subsequent put() on the map. This would only be relevant if the producer
         // disconnects and re-connects very quickly. At that point the call can be coming from a different thread
         synchronized (highestSequencedPushed) {
             Long lastSequenceIdPushed = highestSequencedPushed.get(producerName);
-            if (lastSequenceIdPushed != null && sequenceId <= lastSequenceIdPushed) {
+            // All chunks of a message use the same message metadata and sequence ID,
+            // so it's expected for sequenceId == lastSequenceIdPushed when the chunk ID > 0.
+            if (lastSequenceIdPushed != null && (chunkID > 0 ? sequenceId < lastSequenceIdPushed

Review Comment:
   ```suggestion
               // "chunkID == 0" means that the message is the first one of the chunk list.
               if (lastSequenceIdPushed != null && (chunkID > 0 ? sequenceId < lastSequenceIdPushed
   ```
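
For reference, a standalone sketch of the comparison as I read it. The quoted diff line is cut off after the first branch of the ternary, so the second branch below (`sequenceId <= lastSequenceIdPushed`, taken from the removed line) is an assumption. The class and method names are hypothetical.

```java
// Sketch only: mirrors the condition in the diff, under the stated assumption
// about the truncated second branch of the ternary.
final class ChunkDedupSketch {

    static boolean isDuplicate(Long lastSequenceIdPushed, long sequenceId, long chunkId) {
        if (lastSequenceIdPushed == null) {
            return false;  // nothing has been pushed by this producer yet
        }
        // Later chunks (chunkId > 0) share the sequence ID of the first chunk,
        // so an equal sequence ID is expected for them and is not a duplicate.
        return chunkId > 0
                ? sequenceId < lastSequenceIdPushed
                : sequenceId <= lastSequenceIdPushed;
    }

    public static void main(String[] args) {
        System.out.println(isDuplicate(5L, 5, 0)); // first chunk or non-chunked, same ID
        System.out.println(isDuplicate(5L, 5, 1)); // later chunk of the same message
    }
}
```

Under this predicate a chunk ID of `-1` for non-chunked messages behaves the same as `0`, since both fail the `chunkId > 0` test.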



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.