BewareMyPower commented on PR #16797:
URL: https://github.com/apache/pulsar/pull/16797#issuecomment-1203502113

   @RobertIndie @Gleiphir2769 **In short, I think there is no need to check the 
chunk msg size.**
   
   I guess the original code of `msgMetadata.getChunkId() >= 
msgMetadata.getTotalChunkMsgSize()` might be `msgMetadata.getChunkId() >= 
msgMetadata.getNumChunksFromMsg()` to check if the chunk id is invalid. See the 
following log:
   
   
https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1439-L1441
   
   Intuitively, the "total-chunks" should be the number of chunks, but 
`getTotalChunkMsgSize()` might be somehow misleading. I believe the original 
purpose is to apply the range check for chunk id.
   
   Back to the discussion, the `total_chunk_msg_size` field is designed only 
for a better initial size of the byte buffer, see 
https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1421-L1422
   
   Only the 1st chunk's size is used.
   
   To verify if a chunk is valid, we have already used `uuid` for it 
https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1424-L1425
   
   If you are very suspicious about if the producer has set these metadata 
correctly, checking the `total_chunk_msg_size` is **far not enough**.
   
   First, we only check `num_chunks_for_msg` for whether it's a chunk.
   
   
https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1302-L1303
   
   However, what if
   - `chunk_id`, `uuid` or `total_chunk_msg_size` was not set but 
`num_chunks_from_msg` was set?
   - `uuid` fields are different among the chunks of the same message?
   - `total_chunk_msg_size` fields are different among the chunks of the same 
message?
   
   In these cases above, we should fail fast instead of only logging a line and 
skip these chunks. These error cases should be detected in the test phase. It's 
an **ERROR**, not an **EXCEPTION**.
   
   We check the order of chunk id because even if the producer has set chunk 
related metadata correctly, if there is a bug at broker side, which might be 
caused by a rare case when dispatching messages, the chunks might be out or 
order. Discarding them might be helpful to find the bug so that we can fix it 
in broker.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to