codelipenghui commented on code in PR #15413:
URL: https://github.com/apache/pulsar/pull/15413#discussion_r867344126


##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/BatchMessageContainerImpl.java:
##########
@@ -169,6 +187,10 @@ public void discard(Exception ex) {
             if (firstCallback != null) {
                 firstCallback.sendComplete(ex);
             }
+            if (batchedMessageMetadataAndPayload != null) {
+                
ReferenceCountUtil.safeRelease(batchedMessageMetadataAndPayload);
+                batchedMessageMetadataAndPayload = null;
+            }

Review Comment:
   Nice catch!
   
   Looks like we can also remove the lines 100 to 103



##########
pulsar-common/src/main/java/org/apache/pulsar/common/protocol/Commands.java:
##########
@@ -519,6 +519,8 @@ public static ByteBufPair newSend(long producerId, long 
sequenceId, long highest
                 .setSequenceId(sequenceId);
         if (highestSequenceId >= 0) {
             send.setHighestSequenceId(highestSequenceId);
+        } else {
+            send.clearHighestSequenceId();

Review Comment:
   The method `localCmd` (line: 516) already cleared all the fields, why do we 
need to clear `highestSequenceId` here?
   



##########
pulsar-broker/src/test/java/org/apache/pulsar/client/api/ClientDeduplicationTest.java:
##########
@@ -46,7 +64,7 @@ protected void cleanup() throws Exception {
         super.internalCleanup();
     }
 
-    @Test
+    @Test(priority = -1)

Review Comment:
   Any reason for setting priority to `-1`?



##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/BatchMessageKeyBasedContainer.java:
##########
@@ -115,66 +82,46 @@ public boolean isMultiBatches() {
         return true;
     }
 
-    private ProducerImpl.OpSendMsg createOpSendMsg(KeyedBatch keyedBatch) 
throws IOException {
-        ByteBuf encryptedPayload = 
producer.encryptMessage(keyedBatch.messageMetadata,
-                keyedBatch.getCompressedBatchMetadataAndPayload());
-        if (encryptedPayload.readableBytes() > ClientCnx.getMaxMessageSize()) {
-            keyedBatch.discard(new 
PulsarClientException.InvalidMessageException(
-                    "Message size is bigger than " + 
ClientCnx.getMaxMessageSize() + " bytes"));
-            return null;
-        }
-
-        final int numMessagesInBatch = keyedBatch.messages.size();
-        long currentBatchSizeBytes = 0;
-        for (MessageImpl<?> message : keyedBatch.messages) {
-            currentBatchSizeBytes += message.getDataBuffer().readableBytes();
-        }
-        keyedBatch.messageMetadata.setNumMessagesInBatch(numMessagesInBatch);
-        if (currentTxnidMostBits != -1) {
-            keyedBatch.messageMetadata.setTxnidMostBits(currentTxnidMostBits);
-        }
-        if (currentTxnidLeastBits != -1) {
-            
keyedBatch.messageMetadata.setTxnidLeastBits(currentTxnidLeastBits);
-        }
-        ByteBufPair cmd = producer.sendMessage(producer.producerId, 
keyedBatch.sequenceId, numMessagesInBatch,
-                keyedBatch.messageMetadata, encryptedPayload);
-
-        ProducerImpl.OpSendMsg op = ProducerImpl.OpSendMsg.create(
-                keyedBatch.messages, cmd, keyedBatch.sequenceId, 
keyedBatch.firstCallback);
-
-        op.setNumMessagesInBatch(numMessagesInBatch);
-        op.setBatchSizeByte(currentBatchSizeBytes);
-        return op;
-    }
-
     @Override
     public List<ProducerImpl.OpSendMsg> createOpSendMsgs() throws IOException {
-        List<ProducerImpl.OpSendMsg> result = new ArrayList<>();
-        List<KeyedBatch> list = new ArrayList<>(batches.values());
-        list.sort(((o1, o2) -> ComparisonChain.start()
-                .compare(o1.sequenceId, o2.sequenceId)
-                .result()));
-        for (KeyedBatch keyedBatch : list) {
-            ProducerImpl.OpSendMsg op = createOpSendMsg(keyedBatch);
-            if (op != null) {
-                result.add(op);
+        try {
+            // In key based batching, the sequence ids might not be ordered, 
for example,
+            // | key | sequence id list |
+            // | :-- | :--------------- |
+            // | A | 0, 3, 4 |
+            // | B | 1, 2 |
+            // The message order should be 1, 2, 0, 3, 4 so that a message 
with a sequence id <= 4 should be dropped.
+            // However, for a MessageMetadata with both `sequence_id` and 
`highest_sequence_id` fields, the broker will
+            // expect a strict order so that the batch of key "A" (0, 3, 4) 
will be dropped.
+            // Therefore, we should update the `sequence_id` field to the 
highest sequence id and remove the
+            // `highest_sequence_id` field to allow the weak order.
+            batches.values().forEach(batchMessageContainer -> {
+                
batchMessageContainer.setLowestSequenceId(batchMessageContainer.getHighestSequenceId());
+                batchMessageContainer.setHighestSequenceId(-1L);

Review Comment:
   We can remove this line? we have set the sequence ID with the 
`highestSequenceId`, we will get a same behavior between `highestSequenceId == 
-1` or `highestSequenceId = sequenceId`



##########
pulsar-client/src/main/java/org/apache/pulsar/client/impl/BatchMessageKeyBasedContainer.java:
##########
@@ -49,37 +38,21 @@
  */
 class BatchMessageKeyBasedContainer extends AbstractBatchMessageContainer {
 
-    private Map<String, KeyedBatch> batches = new HashMap<>();
+    private final Map<String, BatchMessageContainerImpl> batches = new 
HashMap<>();
 
     @Override
     public boolean add(MessageImpl<?> msg, SendCallback callback) {
         if (log.isDebugEnabled()) {
             log.debug("[{}] [{}] add message to batch, num messages in batch 
so far is {}", topicName, producerName,
                     numMessagesInBatch);
         }
-        numMessagesInBatch++;
-        currentBatchSizeBytes += msg.getDataBuffer().readableBytes();
         String key = getKey(msg);
-        KeyedBatch part = batches.get(key);
-        if (part == null) {
-            part = new KeyedBatch();
-            part.addMsg(msg, callback);
-            part.compressionType = compressionType;
-            part.compressor = compressor;
-            part.maxBatchSize = maxBatchSize;
-            part.topicName = topicName;
-            part.producerName = producerName;
-            batches.putIfAbsent(key, part);
-
-            if (msg.getMessageBuilder().hasTxnidMostBits() && 
currentTxnidMostBits == -1) {
-                currentTxnidMostBits = 
msg.getMessageBuilder().getTxnidMostBits();
-            }
-            if (msg.getMessageBuilder().hasTxnidLeastBits() && 
currentTxnidLeastBits == -1) {
-                currentTxnidLeastBits = 
msg.getMessageBuilder().getTxnidLeastBits();
-            }
-
-        } else {
-            part.addMsg(msg, callback);
+        final BatchMessageContainerImpl batchMessageContainer = 
batches.computeIfAbsent(key,
+                __ -> new BatchMessageContainerImpl(producer));
+        batchMessageContainer.add(msg, callback);
+        if (batchMessageContainer.isAdded()) {

Review Comment:
   Looks like we can use `batchMessageContainer.isEmpty()` here to avoid adding 
a new field `added` in `BatchMessageContainerImpl`? It's only for handling the 
case of encountering exceptions while constructing the message. Add we can some 
description here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to