syhily commented on pull request #17452:
URL: https://github.com/apache/flink/pull/17452#issuecomment-1039111204


   > Looks mostly good I am only a bit skeptical about the flushing behavior + 
the message delayer.
   > 
   > One quick note: For Kafka, the pending transactions block the visibility 
of new incoming records it is the same with Pulsar? In the KafkaSink we have 
implemented a mechanism to abort the potential lingering transactions.
   
   Tks for your detailed review. It seems like most of your question is just 
because you aren't familiar with Pulsar. I'll explain the flush behavior and 
message delayer in the review comments. First of all, let me explain the 
mechanism for Pulsar to abort the potential lingering transactions.
   
   ### Writing message with transaction
   
   Pulsar isn't like Kafka in writing messages with a transaction, the pending 
transactions will not block the messages written in other transactions. The 
transaction is just a small mark (`TxnID`) that can be shared with any 
producers or consumers. The producer can write messages to multiple 
transactions.
   
   The message, written in a specified Pulsar transaction, wouldn't be seen by 
the consumer until this transaction was committed. That means we don't need to 
abort the potential lingering transactions.
   
   `PulsarWriter` would abort all the pending transactions before closing the 
pipeline. If the application crashed and the pending transactions would just be 
kept in Pulsar until they meet the timeout. And this doesn't affect any new 
records which would be written in a new transaction.
   
   ### Message delayer
   
   [Delayed message 
delivery](https://pulsar.apache.org/docs/en/next/concepts-messaging/#delayed-message-delivery)
 would confuse someone like you by its naming. People would think that this 
behavior would happen on the client-side (`PulsarWriter`).
   
   But this is wrong. The message with delayed configuration would be sent to 
Pulsar async. We didn't cache the message and sent it when meet the delay time. 
This is handled by Pulsar Bookie on the server-side.
   
   Test this feature is just like testing Pulsar. I don't think we need the 
test on this feature.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to