BewareMyPower commented on PR #16797: URL: https://github.com/apache/pulsar/pull/16797#issuecomment-1203502113
@RobertIndie @Gleiphir2769 **In short, I think there is no need to check the chunk msg size.** I guess the original code of `msgMetadata.getChunkId() >= msgMetadata.getTotalChunkMsgSize()` might be `msgMetadata.getChunkId() >= msgMetadata.getNumChunksFromMsg()` to check if the chunk id is invalid. See the following log: https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1439-L1441 Intuitively, the "total-chunks" should be the number of chunks, but `getTotalChunkMsgSize()` might be somehow misleading. I believe the original purpose is to apply the range check for chunk id. Back to the discussion, the `total_chunk_msg_size` field is designed only for a better initial size of the byte buffer, see https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1421-L1422 Only the 1st chunk's size is used. To verify if a chunk is valid, we have already used `uuid` for it https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1424-L1425 If you are very suspicious about if the producer has set these metadata correctly, checking the `total_chunk_msg_size` is **far not enough**. First, we only check `num_chunks_for_msg` for whether it's a chunk. https://github.com/apache/pulsar/blob/2e8bd3d7b17190d6fb45d5b35eff598948975385/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConsumerImpl.java#L1302-L1303 However, what if - `chunk_id`, `uuid` or `total_chunk_msg_size` was not set but `num_chunks_from_msg` was set? - `uuid` fields are different among the chunks of the same message? - `total_chunk_msg_size` fields are different among the chunks of the same message? In these cases above, we should fail fast instead of only logging a line and skip these chunks. These error cases should be detected in the test phase. It's an **ERROR**, not an **EXCEPTION**. We check the order of chunk id because even if the producer has set chunk related metadata correctly, if there is a bug at broker side, which might be caused by a rare case when dispatching messages, the chunks might be out or order. Discarding them might be helpful to find the bug so that we can fix it in broker. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
