geniusjoe opened a new pull request, #1462:
URL: https://github.com/apache/pulsar-client-go/pull/1462

   Fixes https://github.com/apache/pulsar-client-go/issues/1448
   
   ### Motivation
   Refer to issue:
   > When we produce a message with payload > maxChunkSize * 
MaxPendingMessages, this single message will occupy the entire partition 
producer's p.publishSemaphore and cannot be released, causing the entire 
partition producer sending progress block forever.
   
   The primary reason why the Java SDK does not have a message size limit, is 
due to its different chunk message generation strategy: 
   In Java, for each chunk split from the original message, a semaphore for one 
message is acquired and then the chunk is written to the pendingQueue for 
asynchronous sending. 
   In Go, however, the system must wait until all semaphores for the current 
message are acquired before sending the entire batch to the pendingQueue. We 
can see `testBlockIfQueueFullWhenChunking()` describe this issue in Java code:
   
   
https://github.com/apache/pulsar/blob/f0ec07b3d8c5cfe36942957fc0ad32e40d69320d/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ProducerImpl.java#L657
   
   ```Java
   for (int chunkId = 0; chunkId < totalChunks; chunkId++) {
       ...
       // check pendingQueue permit 
       if (chunkId > 0 && conf.isBlockIfQueueFull() && 
!canEnqueueRequest(callback,
               message.getSequenceId(), 0 /* The memory was already reserved 
*/)) {
           ...
           return;
       }
       // send chunk message individually 
       synchronized (this) {
           // Update the message metadata before computing the payload chunk 
size
           // to avoid a large message cannot be split into chunks.
           final long sequenceId = updateMessageMetadataSequenceId(msgMetadata);
           String uuid = totalChunks > 1 ? String.format("%s-%d", producerName, 
sequenceId) : null;
           serializeAndSendMessage(msg, payload, sequenceId, uuid, chunkId, 
totalChunks,
                   readStartIndex, payloadChunkSize, compressedPayload, 
compressed,
                   compressedPayload.readableBytes(), callback, 
chunkedMessageCtx, messageId);
           readStartIndex = ((chunkId + 1) * payloadChunkSize);
       }
   }
   ```
   
   ### Modifications
   1. Added validation for message payload size and pendingQueue in 
`pulsar/producer_partition.go#updateChunkInfo`. However, the current bugfix 
does not resolve the potential deadlock issue caused by multiple chunk messages 
refer to  
https://github.com/apache/pulsar/blob/f0ec07b3d8c5cfe36942957fc0ad32e40d69320d/pulsar-broker/src/test/java/org/apache/pulsar/client/impl/MessageChunkingTest.java#L706
   
   2. Added cleanup handling when chunk messages occupy a portion of the 
semaphore, fixing the issue where the semaphore is not released after 
destruction in `sendRequest.done()`
   
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   This change added tests and can be verified as follows:
   `TestChunkBlockIfQueueFullWithoutTimeout`
   `TestSemaphoreStateWithChunkAndTimeout`
   
   ### Does this pull request potentially affect one of the following parts:
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API: (no)
     - The schema: (no)
     - The default values of configurations: (no)
     - The wire protocol: (no)
   
   ### Documentation
     - Does this pull request introduce a new feature? (no)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to